Merge pull request IQSS#10590 from IQSS/10570-search-improvements
Extra settings for limiting search facets 10570
sekmiller authored Jul 1, 2024
2 parents 2917dbe + 9d2a14a commit 72a2e89
Showing 6 changed files with 98 additions and 10 deletions.
4 changes: 4 additions & 0 deletions doc/release-notes/10570-extra-facet-settings.md
@@ -0,0 +1,4 @@
Additional settings (`:DisableSolrFacetsForGuestUsers`, `:DisableSolrFacetsWithoutJsession` and `:DisableUncheckedTypesFacet`) give instance admins more options for
selectively limiting the availability of search facets on the Collection and Dataset pages.
See the [Disable Solr Facets](https://guides.dataverse.org/en/6.3/installation/config.html#DisableSolrFacets) section of the Config Guide and the sections that follow it for more information.

19 changes: 18 additions & 1 deletion doc/sphinx-guides/source/installation/config.rst
@@ -3867,7 +3867,7 @@ If ``:SolrFullTextIndexing`` is set to true, the content of files of any size wi
:DisableSolrFacets
++++++++++++++++++

Setting this to ``true`` will make the collection ("dataverse") page start showing search results without the usual search facets on the left side of the page. A message will be shown in that column informing the users that facets are temporarily unavailable. Generating the facets is more resource-intensive for Solr than the main search results themselves, so applying this measure will significantly reduce the load on the search engine when its performance becomes an issue.
Setting this to ``true`` will make the collection ("dataverse") page start showing search results without the usual search facets on the left side of the page. A message will be shown in that column informing the users that facets are temporarily unavailable. Generating the facets may in some cases be more resource-intensive for Solr than the main search results themselves, so applying this measure can significantly reduce the load on the search engine when its performance becomes an issue.

This setting can be used in combination with the "circuit breaker" mechanism on the Solr side (see the "Installing Solr" section of the Installation Prerequisites guide). An admin can choose to enable it, or even create an automated system for enabling it in response to Solr beginning to drop incoming requests with the HTTP code 503.

@@ -3876,6 +3876,23 @@ To enable the setting::
curl -X PUT -d true "http://localhost:8080/api/admin/settings/:DisableSolrFacets"
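
The automated response to HTTP 503 errors mentioned above could be sketched roughly as follows. This is only an illustration, not something shipped with Dataverse: the Solr URL (including the ``collection1`` core name) and the function names are assumptions to adapt to the installation.

```shell
# Sketch only: both URLs are assumptions, adjust them for your installation.
SOLR_PING_URL="${SOLR_PING_URL:-http://localhost:8983/solr/collection1/select?q=*:*&rows=0}"
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

# Print the HTTP status code that Solr returns for a trivial query.
solr_status() {
  curl -s -o /dev/null -w '%{http_code}' "$SOLR_PING_URL"
}

# Turn the facets off when the given status code is 503; do nothing otherwise.
disable_facets_on_503() {
  if [ "$1" = "503" ]; then
    curl -s -X PUT -d true "$DATAVERSE_URL/api/admin/settings/:DisableSolrFacets"
  fi
}

# Usage (for example from a cron job):
#   disable_facets_on_503 "$(solr_status)"
```

A script like this only turns the facets off; an admin would still need to decide when to turn them back on.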


:DisableSolrFacetsForGuestUsers
+++++++++++++++++++++++++++++++

Similar to ``:DisableSolrFacets`` above, but disables the facets for Guest (unauthenticated) users only.
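
Assuming this setting is managed through the same admin settings endpoint as ``:DisableSolrFacets``, enabling it might look like the sketch below. The function name and the ``DATAVERSE_URL`` variable are illustrative only.

```shell
# Assumption: the admin API is reachable on localhost, as for :DisableSolrFacets.
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"

# Hide the facets from Guest (unauthenticated) users only.
disable_facets_for_guests() {
  curl -X PUT -d true "$DATAVERSE_URL/api/admin/settings/:DisableSolrFacetsForGuestUsers"
}

# Usage:
#   disable_facets_for_guests
```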

:DisableSolrFacetsWithoutJsession
+++++++++++++++++++++++++++++++++

Same idea as with the two settings above. For the purposes of this setting, a request is considered "anonymous" if it arrives without the ``JSESSIONID`` cookie. A UI user who is browsing the holdings without logging in will have a valid ``JSESSIONID`` cookie, tied to a guest session. The main purpose of this setting is to hide the facets from bots, scripted crawlers and the like, most of which (though not all) do not use cookies. Keeping the bots away from the facets can serve a dual purpose on a busy instance experiencing problems with such abuse: it saves the CPU cycles and other resources that would be spent generating the facets and, even more importantly, it can prevent bots from attempting to crawl the facet trees, which has the potential to multiply the service load.
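
The difference between a cookie-less client and one that keeps a session can be illustrated with curl. This is a sketch under assumptions: the collection alias ``root``, the cookie jar path and the function names are hypothetical, and the setting is assumed to be enabled through the same admin endpoint as ``:DisableSolrFacets``.

```shell
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"
COLLECTION_PATH="${COLLECTION_PATH:-/dataverse/root}"   # hypothetical collection alias
COOKIE_JAR="${COOKIE_JAR:-/tmp/dataverse-cookies.txt}"  # hypothetical file location

# Enable the setting.
enable_no_jsession_setting() {
  curl -X PUT -d true "$DATAVERSE_URL/api/admin/settings/:DisableSolrFacetsWithoutJsession"
}

# A client that keeps no cookies, like most crawlers: no JSESSIONID is sent.
fetch_without_cookies() {
  curl -s "$DATAVERSE_URL$COLLECTION_PATH"
}

# A client that stores cookies (-c) and sends them back (-b), like a browser.
# Once a session cookie has been stored, subsequent calls present it.
fetch_with_cookies() {
  curl -s -c "$COOKIE_JAR" -b "$COOKIE_JAR" "$DATAVERSE_URL$COLLECTION_PATH"
}
```

With the setting enabled, requests of the first kind are the ones this setting treats as anonymous.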

.. _:DisableUncheckedTypesFacet:

:DisableUncheckedTypesFacet
+++++++++++++++++++++++++++

Another option for reducing the load on Solr on a busy instance. Rather than disabling all the search facets, this setting affects only one: the facet on the upper left of the collection page, where users can select the types of objects to search (Collections ("Dataverses"), Datasets and/or Files). With this option set to ``true``, the numbers of results will only be shown for the types actually selected (i.e. only for the search results currently shown to the user). This minor feature, being able to tell the user how many files (for example) they *would* find *if* they chose to search for files by clicking the Files facet, essentially doubles the expense of running the search, because a second query is needed to count the unselected types. That may still be negligible on an instance with lighter holdings, but can make a significant difference for a large and heavily used archive.
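
A sketch of turning this setting on and removing it again, assuming it is managed through the same admin settings endpoint as ``:DisableSolrFacets`` and that the endpoint accepts ``DELETE`` to remove a setting. The function names are illustrative.

```shell
DATAVERSE_URL="${DATAVERSE_URL:-http://localhost:8080}"
SETTING_URL="$DATAVERSE_URL/api/admin/settings/:DisableUncheckedTypesFacet"

# Skip the second search that counts the unchecked types.
disable_unchecked_types_facet() {
  curl -X PUT -d true "$SETTING_URL"
}

# Remove the setting again, restoring the default (false) behavior.
restore_unchecked_types_facet() {
  curl -X DELETE "$SETTING_URL"
}
```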

.. _:SignUpUrl:

:SignUpUrl
21 changes: 21 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/DatasetPage.java
@@ -8,6 +8,7 @@
import edu.harvard.iq.dataverse.authorization.users.ApiToken;
import edu.harvard.iq.dataverse.authorization.users.AuthenticatedUser;
import edu.harvard.iq.dataverse.authorization.users.PrivateUrlUser;
import edu.harvard.iq.dataverse.authorization.users.GuestUser;
import edu.harvard.iq.dataverse.authorization.users.User;
import edu.harvard.iq.dataverse.branding.BrandingUtil;
import edu.harvard.iq.dataverse.dataaccess.StorageIO;
@@ -138,6 +139,7 @@
import jakarta.faces.event.AjaxBehaviorEvent;
import jakarta.servlet.ServletOutputStream;
import jakarta.servlet.http.HttpServletResponse;
import jakarta.servlet.http.HttpServletRequest;

import org.apache.commons.text.StringEscapeUtils;
import org.apache.commons.lang3.mutable.MutableBoolean;
@@ -787,6 +789,25 @@ public boolean isIndexedVersion() {
return isIndexedVersion = false;
}

        // plus we have mechanisms for disabling the facets selectively, just for
        // the guests, or anonymous users:
        if (session.getUser() instanceof GuestUser) {
            if (settingsWrapper.isTrueForKey(SettingsServiceBean.Key.DisableSolrFacetsForGuestUsers, false)) {
                return isIndexedVersion = false;
            }

            // An even lower grade of user than Guest is a truly anonymous user -
            // a guest user who came without the session cookie:
            Map<String, Object> cookies = FacesContext.getCurrentInstance().getExternalContext().getRequestCookieMap();
            if (!(cookies != null && cookies.containsKey("JSESSIONID"))) {
                if (settingsWrapper.isTrueForKey(SettingsServiceBean.Key.DisableSolrFacetsWithoutJsession, false)) {
                    return isIndexedVersion = false;
                }
            }

        }


// The version is SUPPOSED to be indexed if it's the latest published version, or a
// draft. So if none of the above is true, we can return false right away.
if (!(workingVersion.isDraft() || isThisLatestReleasedVersion())) {
Original file line number Diff line number Diff line change
@@ -21,6 +21,7 @@
import edu.harvard.iq.dataverse.SettingsWrapper;
import edu.harvard.iq.dataverse.ThumbnailServiceWrapper;
import edu.harvard.iq.dataverse.WidgetWrapper;
import edu.harvard.iq.dataverse.authorization.users.GuestUser;
import edu.harvard.iq.dataverse.engine.command.DataverseRequest;
import edu.harvard.iq.dataverse.settings.SettingsServiceBean;
import edu.harvard.iq.dataverse.util.BundleUtil;
@@ -395,7 +396,7 @@ The real issue here (https://github.com/IQSS/dataverse/issues/7304) is caused
}
}

if (!wasSolrErrorEncountered() && selectedTypesList.size() < 3 && !isSolrTemporarilyUnavailable() && !isFacetsDisabled()) {
if (!wasSolrErrorEncountered() && selectedTypesList.size() < 3 && !isSolrTemporarilyUnavailable() && !isFacetsDisabled() && !isUncheckedTypesFacetDisabled()) {
// If some types are NOT currently selected, we will need to
// run a second search to obtain the numbers of the unselected types:

@@ -1086,20 +1087,59 @@ public void setSolrTemporarilyUnavailable(boolean solrIsTemporarilyUnavailable)
this.solrIsTemporarilyUnavailable = solrIsTemporarilyUnavailable;
}

Boolean solrFacetsDisabled = null;
/**
* Indicates that the fragment should not be requesting facets in Solr
* searches and rendering them on the page.
* @return true if disabled; false by default
*/
public boolean isFacetsDisabled() {
// This method is used in rendered="..." logic, so it may be called many
// times per page. We use SettingsWrapper and cache the result in
// solrFacetsDisabled to avoid looking it up repeatedly (settings are
// not expensive to look up, but still).
if (this.solrFacetsDisabled != null) {
return this.solrFacetsDisabled;
}

if (settingsWrapper.isTrueForKey(SettingsServiceBean.Key.DisableSolrFacets, false)) {
return this.solrFacetsDisabled = true;
}

// We also have mechanisms for disabling the facets selectively, just for
// the guests, or anonymous users:
if (session.getUser() instanceof GuestUser) {
if (settingsWrapper.isTrueForKey(SettingsServiceBean.Key.DisableSolrFacetsForGuestUsers, false)) {
return this.solrFacetsDisabled = true;
}

// An even lower grade of user than Guest is a truly anonymous user -
// a guest user who came without the session cookie:
Map<String, Object> cookies = FacesContext.getCurrentInstance().getExternalContext().getRequestCookieMap();
if (!(cookies != null && cookies.containsKey("JSESSIONID"))) {
if (settingsWrapper.isTrueForKey(SettingsServiceBean.Key.DisableSolrFacetsWithoutJsession, false)) {
return this.solrFacetsDisabled = true;
}
}
}

return settingsWrapper.isTrueForKey(SettingsServiceBean.Key.DisableSolrFacets, false);
return this.solrFacetsDisabled = false;
}

    Boolean disableSecondPassSearch = null;

    /**
     * Indicates that we do not need to run the second search query to populate
     * the counts for *unchecked* type facets.
     * @return true if disabled; false by default
     */
    public boolean isUncheckedTypesFacetDisabled() {
        if (this.disableSecondPassSearch != null) {
            return this.disableSecondPassSearch;
        }
        if (settingsWrapper.isTrueForKey(SettingsServiceBean.Key.DisableUncheckedTypesFacet, false)) {
            return this.disableSecondPassSearch = true;
        }
        return this.disableSecondPassSearch = false;
    }


public boolean isRootDv() {
return rootDv;
}
Original file line number Diff line number Diff line change
@@ -671,6 +671,9 @@ Whether Harvesting (OAI) service is enabled
* and dataset pages instantly
*/
DisableSolrFacets,
DisableSolrFacetsForGuestUsers,
DisableSolrFacetsWithoutJsession,
DisableUncheckedTypesFacet,
/**
* When ingesting tabular data files, store the generated tab-delimited
* files *with* the variable names line up top.
9 changes: 6 additions & 3 deletions src/main/webapp/search-include-fragment.xhtml
@@ -132,9 +132,10 @@

<span class="icon-dataverse text-icon-inline"></span>

<h:outputFormat styleClass="facetTypeDataverse" value="#{bundle['dataverse.results.types.dataverses']} &#40;{0}&#41;">
<h:outputFormat rendered="#{!SearchIncludeFragment.uncheckedTypesFacetDisabled or SearchIncludeFragment.selectedTypesList.contains('dataverses')}" styleClass="facetTypeDataverse" value="#{bundle['dataverse.results.types.dataverses']} &#40;{0}&#41;">
<f:param value="#{SearchIncludeFragment.facetCountDataverses}"/>
</h:outputFormat>
<h:outputText rendered="#{SearchIncludeFragment.uncheckedTypesFacetDisabled and !SearchIncludeFragment.selectedTypesList.contains('dataverses')}" value="#{bundle['dataverse.results.types.dataverses']}" styleClass="facetTypeDataverse"/>
</h:outputLink>
</div>
<!--DATASETS TOGGLE-->
@@ -164,9 +165,10 @@

<span class="icon-dataset text-icon-inline"></span>

<h:outputFormat styleClass="facetTypeDataset" value="#{bundle['dataverse.results.types.datasets']} &#40;{0}&#41;">
<h:outputFormat rendered="#{!SearchIncludeFragment.uncheckedTypesFacetDisabled or SearchIncludeFragment.selectedTypesList.contains('datasets')}" styleClass="facetTypeDataset" value="#{bundle['dataverse.results.types.datasets']} &#40;{0}&#41;">
<f:param value="#{SearchIncludeFragment.facetCountDatasets}"/>
</h:outputFormat>
<h:outputText rendered="#{SearchIncludeFragment.uncheckedTypesFacetDisabled and !SearchIncludeFragment.selectedTypesList.contains('datasets')}" value="#{bundle['dataverse.results.types.datasets']}" styleClass="facetTypeDataset"/>
</h:outputLink>
</div>
<!--FILES TOGGLE-->
@@ -196,9 +198,10 @@

<span class="icon-file text-icon-inline"></span>

<h:outputFormat styleClass="facetTypeFile" value="#{bundle['dataverse.results.types.files']} &#40;{0}&#41;">
<h:outputFormat rendered="#{!SearchIncludeFragment.uncheckedTypesFacetDisabled or SearchIncludeFragment.selectedTypesList.contains('files')}" styleClass="facetTypeFile" value="#{bundle['dataverse.results.types.files']} &#40;{0}&#41;">
<f:param value="#{SearchIncludeFragment.facetCountFiles}"/>
</h:outputFormat>
<h:outputText rendered="#{SearchIncludeFragment.uncheckedTypesFacetDisabled and !SearchIncludeFragment.selectedTypesList.contains('files')}" value="#{bundle['dataverse.results.types.files']}" styleClass="facetTypeFile"/>
</h:outputLink>
</div>
</h:form>