Solr 9.7.0 and other lib updates from QDR #10713

Open: wants to merge 11 commits into base: develop
4 changes: 2 additions & 2 deletions .env
@@ -1,5 +1,5 @@
APP_IMAGE=gdcc/dataverse:unstable
POSTGRES_VERSION=17
DATAVERSE_DB_USER=dataverse
-SOLR_VERSION=9.3.0
-SKIP_DEPLOY=0
+SOLR_VERSION=9.7.0
+SKIP_DEPLOY=0
21 changes: 20 additions & 1 deletion conf/solr/solrconfig.xml
@@ -35,7 +35,7 @@
that you fully re-index after changing this setting as it can
affect both how text is indexed and queried.
-->
-<luceneMatchVersion>9.7</luceneMatchVersion>
+<luceneMatchVersion>9.11</luceneMatchVersion>

<!-- <lib/> directives can be used to instruct Solr to load any Jars
identified and use them to resolve any "plugins" specified in
@@ -360,6 +360,21 @@
-->
<maxBooleanClauses>${solr.max.booleanClauses:1024}</maxBooleanClauses>

+<!-- Minimum acceptable prefix-size for prefix-based queries.
+
+Prefix-based queries consume memory in proportion to the number of terms in the index
+that start with that prefix. Short prefixes tend to match many many more indexed-terms
+and consume more memory as a result, sometimes causing stability issues on the node.
+
+This setting allows administrators to require that prefixes meet or exceed a specified
+minimum length requirement. Prefix queries that don't meet this requirement return an
+error to users. The limit may be overridden on a per-query basis by specifying a
+'minPrefixQueryTermLength' local-param value.
+
+The flag value of '-1' can be used to disable enforcement of this limit.
+-->
+<minPrefixQueryTermLength>${solr.query.minPrefixLength:-1}</minPrefixQueryTermLength>
+
<!-- Solr Internal Query Caches
Starting with Solr 9.0 the default cache implementation used is CaffeineCache.
-->
@@ -1015,6 +1030,10 @@
<str name="pattern">[^\w-\.]</str>
<str name="replacement">_</str>
</updateProcessor>
+<updateProcessor class="solr.NumFieldLimitingUpdateRequestProcessorFactory" name="max-fields">
+<int name="maxFields">1000</int>
+<bool name="warnOnly">true</bool>
+</updateProcessor>
<updateProcessor class="solr.ParseBooleanFieldUpdateProcessorFactory" name="parse-boolean"/>
<updateProcessor class="solr.ParseLongFieldUpdateProcessorFactory" name="parse-long"/>
<updateProcessor class="solr.ParseDoubleFieldUpdateProcessorFactory" name="parse-double"/>
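The comment added with `minPrefixQueryTermLength` describes a simple rule: a prefix shorter than the configured minimum is rejected, and the flag value `-1` switches the check off. The following is a minimal Python sketch of that rule as the comment states it. It is an illustration only, not Solr's implementation, and the function name `prefix_allowed` is hypothetical.

```python
def prefix_allowed(prefix: str, min_length: int = -1) -> bool:
    """Return True if a prefix query term passes the length limit.

    Models the rule described in the solrconfig.xml comment:
    -1 disables enforcement; otherwise the prefix must meet or
    exceed min_length. Illustrative only, not Solr's code.
    """
    if min_length == -1:
        # Flag value: no limit is enforced.
        return True
    return len(prefix) >= min_length
```

With the default used in this PR, `${solr.query.minPrefixLength:-1}`, the limit is disabled unless the `solr.query.minPrefixLength` property is set, so existing prefix queries should behave as before.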
10 changes: 10 additions & 0 deletions doc/release-notes/10713-Solr9.7.0 and lib updates.md
@@ -0,0 +1,10 @@
Solr 9.7.0 is now the version recommended in our installation guides and used with automated testing. Other libraries Dataverse uses have been updated as well.

For the upgrade instructions section:

[note that 6.5 may contain other solr-related changes, so the instructions may need to contain information merged from multiple release notes!]

If you are upgrading Solr:
- Install solr-9.7.0 following the instructions from the Installation guide.
- Run a full reindex to populate the search catalog.
- Note that it may be possible to skip the reindexing step by simply moving the existing `.../server/solr/collection1/` under the new `solr-9.7.0` installation directory. This however has not been thoroughly tested and is not officially supported.
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@
# chkconfig: 35 92 08
# description: Starts and stops Apache Solr

-SOLR_DIR="/usr/local/solr/solr-9.4.1"
+SOLR_DIR="/usr/local/solr/solr-9.7.0"
SOLR_COMMAND="bin/solr"
SOLR_ARGS="-m 1g"
SOLR_USER=solr
Original file line number Diff line number Diff line change
@@ -5,9 +5,9 @@ After = syslog.target network.target remote-fs.target nss-lookup.target
[Service]
User = solr
Type = forking
-WorkingDirectory = /usr/local/solr/solr-9.4.1
-ExecStart = /usr/local/solr/solr-9.4.1/bin/solr start -m 1g
-ExecStop = /usr/local/solr/solr-9.4.1/bin/solr stop
+WorkingDirectory = /usr/local/solr/solr-9.7.0
+ExecStart = /usr/local/solr/solr-9.7.0/bin/solr start -m 1g
+ExecStop = /usr/local/solr/solr-9.7.0/bin/solr stop
LimitNOFILE=65000
LimitNPROC=65000
Restart=on-failure
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/developers/classic-dev-env.rst
@@ -136,7 +136,7 @@ On Linux, you should just install PostgreSQL using your favorite package manager
Install Solr
^^^^^^^^^^^^

-`Solr <https://lucene.apache.org/solr/>`_ 9.4.1 is required.
+`Solr <https://lucene.apache.org/solr/>`_ 9.7.0 is required.

Follow the instructions in the "Installing Solr" section of :doc:`/installation/prerequisites` in the main Installation guide.

16 changes: 8 additions & 8 deletions doc/sphinx-guides/source/installation/prerequisites.rst
@@ -163,7 +163,7 @@ The Dataverse software search index is powered by Solr.
Supported Versions
==================

-The Dataverse software has been tested with Solr version 9.4.1. Future releases in the 9.x series are likely to be compatible. Please get in touch (:ref:`support`) if you are having trouble with a newer version.
+The Dataverse software has been tested with Solr version 9.7.0. Future releases in the 9.x series are likely to be compatible. Please get in touch (:ref:`support`) if you are having trouble with a newer version.

Installing Solr
===============
@@ -178,19 +178,19 @@ Become the ``solr`` user and then download and configure Solr::

su - solr
cd /usr/local/solr
-wget https://archive.apache.org/dist/solr/solr/9.4.1/solr-9.4.1.tgz
-tar xvzf solr-9.4.1.tgz
-cd solr-9.4.1
+wget https://archive.apache.org/dist/solr/solr/9.7.0/solr-9.7.0.tgz
+tar xvzf solr-9.7.0.tgz
+cd solr-9.7.0
cp -r server/solr/configsets/_default server/solr/collection1

You should already have a "dvinstall.zip" file that you downloaded from https://github.com/IQSS/dataverse/releases . Unzip it into ``/tmp``. Then copy the files into place::

-cp /tmp/dvinstall/schema*.xml /usr/local/solr/solr-9.4.1/server/solr/collection1/conf
-cp /tmp/dvinstall/solrconfig.xml /usr/local/solr/solr-9.4.1/server/solr/collection1/conf
+cp /tmp/dvinstall/schema*.xml /usr/local/solr/solr-9.7.0/server/solr/collection1/conf
+cp /tmp/dvinstall/solrconfig.xml /usr/local/solr/solr-9.7.0/server/solr/collection1/conf

Note: The Dataverse Project team has customized Solr to boost results that come from certain indexed elements inside the Dataverse installation, for example prioritizing results from Dataverse collections over Datasets. If you would like to remove this, edit your ``solrconfig.xml`` and remove the ``<str name="qf">`` element and its contents. If you have ideas about how this boosting could be improved, feel free to contact us through our Google Group https://groups.google.com/forum/#!forum/dataverse-dev .

-A Dataverse installation requires a change to the ``jetty.xml`` file that ships with Solr. Edit ``/usr/local/solr/solr-9.4.1/server/etc/jetty.xml`` , increasing ``requestHeaderSize`` from ``8192`` to ``102400``
+A Dataverse installation requires a change to the ``jetty.xml`` file that ships with Solr. Edit ``/usr/local/solr/solr-9.7.0/server/etc/jetty.xml`` , increasing ``requestHeaderSize`` from ``8192`` to ``102400``

Solr will warn about needing to increase the number of file descriptors and max processes in a production environment but will still run with defaults. We have increased these values to the recommended levels by adding ulimit -n 65000 to the init script, and the following to ``/etc/security/limits.conf``::

@@ -209,7 +209,7 @@ Solr launches asynchronously and attempts to use the ``lsof`` binary to watch fo

Finally, you need to tell Solr to create the core "collection1" on startup::

-echo "name=collection1" > /usr/local/solr/solr-9.4.1/server/solr/collection1/core.properties
+echo "name=collection1" > /usr/local/solr/solr-9.7.0/server/solr/collection1/core.properties

Dataverse collection ("dataverse") page uses Solr very heavily. On a busy instance this may cause the search engine to become the performance bottleneck, making these pages take increasingly longer to load, potentially affecting the overall performance of the application and/or causing Solr itself to crash. If this is observed on your instance, we recommend uncommenting the following lines in the ``<circuitBreaker ...>`` section of the ``solrconfig.xml`` file::

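The Solr version is embedded in every path and URL in the installation steps above, which is why this file needs eight line changes for one version bump. A minimal sketch, assuming a hypothetical helper `solr_paths` (not part of Dataverse or Solr), of deriving those locations from a single version string; the URL and directory patterns are copied from the guide text in this diff:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SolrPaths:
    download_url: str
    install_dir: str
    conf_dir: str
    core_properties: str


def solr_paths(version: str, base: str = "/usr/local/solr") -> SolrPaths:
    """Build the version-specific locations used in the install guide.

    Hypothetical helper for illustration; the patterns mirror the
    commands shown in prerequisites.rst.
    """
    install_dir = f"{base}/solr-{version}"
    core_dir = f"{install_dir}/server/solr/collection1"
    return SolrPaths(
        download_url=(
            "https://archive.apache.org/dist/solr/solr/"
            f"{version}/solr-{version}.tgz"
        ),
        install_dir=install_dir,
        conf_dir=f"{core_dir}/conf",
        core_properties=f"{core_dir}/core.properties",
    )
```

For example, `solr_paths("9.7.0").conf_dir` gives the directory that the `cp /tmp/dvinstall/...` commands copy into.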
2 changes: 1 addition & 1 deletion docker/compose/demo/compose.yml
@@ -134,7 +134,7 @@ services:
solr:
container_name: "solr"
hostname: "solr"
-image: solr:9.4.1
+image: solr:9.7.0
depends_on:
- solr_initializer
restart: on-failure
2 changes: 1 addition & 1 deletion modules/dataverse-parent/pom.xml
@@ -150,7 +150,7 @@
<!-- Major system components and dependencies -->
<payara.version>6.2024.6</payara.version>
<postgresql.version>42.7.4</postgresql.version>
-<solr.version>9.4.1</solr.version>
+<solr.version>9.7.0</solr.version>
<aws.version>1.12.748</aws.version>
<google.library.version>26.30.0</google.library.version>

22 changes: 10 additions & 12 deletions pom.xml
@@ -29,8 +29,8 @@
<reload4j.version>1.2.18.4</reload4j.version>
<flyway.version>10.19.0</flyway.version>
<jhove.version>1.20.1</jhove.version>
-<poi.version>5.2.1</poi.version>
-<tika.version>2.9.1</tika.version>
+<poi.version>5.2.5</poi.version>
+<tika.version>2.9.2</tika.version>
<netcdf.version>5.5.3</netcdf.version>

<openapi.infoTitle>Dataverse API</openapi.infoTitle>
@@ -147,12 +147,12 @@
<dependency>
<groupId>com.apicatalog</groupId>
<artifactId>titanium-json-ld</artifactId>
-<version>1.3.2</version>
+<version>1.4.0</version>
</dependency>
<dependency>
<groupId>com.google.code.gson</groupId>
<artifactId>gson</artifactId>
-<version>2.8.9</version>
+<version>2.9.1</version>
<scope>compile</scope>
</dependency>
<!-- Should be refactored and moved to transitive section above once on Java EE 8 (makes WAR smaller) -->
@@ -168,11 +168,9 @@
<scope>provided</scope>
</dependency>
<dependency>
-<!-- There are later versions of this lib available at jitpack.io,
-but it seemed better to not add another repo. -->
-<groupId>org.everit.json</groupId>
-<artifactId>org.everit.json.schema</artifactId>
-<version>1.5.1</version>
+<groupId>com.github.erosb</groupId>
+<artifactId>everit-json-schema</artifactId>
+<version>1.14.1</version>
</dependency>
<dependency>
<groupId>org.mindrot</groupId>
@@ -335,7 +333,7 @@
<dependency>
<groupId>org.apache.solr</groupId>
<artifactId>solr-solrj</artifactId>
-<version>9.4.1</version>
+<version>9.7.0</version>
</dependency>
<dependency>
<groupId>colt</groupId>
@@ -406,7 +404,7 @@
<dependency>
<groupId>com.github.jai-imageio</groupId>
<artifactId>jai-imageio-core</artifactId>
-<version>1.3.1</version>
+<version>1.4.0</version>
</dependency>
<dependency>
<groupId>org.ocpsoft.rewrite</groupId>
@@ -490,7 +488,7 @@
<dependency>
<groupId>com.google.auto.service</groupId>
<artifactId>auto-service</artifactId>
-<version>1.0-rc2</version>
+<version>1.1.1</version>
<optional>true</optional>
<type>jar</type>
</dependency>
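This PR repeats the Solr version in `.env`, the init script, the systemd unit, the guides, `compose.yml`, and both POM files, and the old values were not uniform (9.3.0 in `.env`, 9.4.1 elsewhere). A small check might help catch a missed spot on the next bump. The following is a minimal sketch, not part of this PR; the function name `stale_solr_versions` is hypothetical, and it only inspects lines that mention "solr" themselves, so it would miss cases like the `solr-solrj` dependency in `pom.xml`, where the version sits on a separate line.

```python
import re

# Matches three-part version numbers such as 9.7.0.
VERSION_RE = re.compile(r"\b\d+\.\d+\.\d+\b")


def stale_solr_versions(lines, expected):
    """Return (line_number, version) pairs that differ from expected.

    Only lines containing "solr" (case-insensitive) are checked.
    Line numbers are 1-based.
    """
    stale = []
    for number, line in enumerate(lines, start=1):
        if "solr" not in line.lower():
            continue
        for version in VERSION_RE.findall(line):
            if version != expected:
                stale.append((number, version))
    return stale
```

Called with the lines of a file and `"9.7.0"`, an empty result would suggest that every Solr reference the check can see has been updated.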