Preparing Cloudera Search to Upgrade to CDH 6
Because Cloudera Search is included in CDH, upgrading CDH upgrades Cloudera Search. If you are upgrading to CDH 6 from CDH 5, and you are using Cloudera Search, you must complete some preparatory work.
Cloudera Search in CDH 6 uses Apache Solr 7, which has some incompatibilities with previous Solr versions. To facilitate the upgrade, Cloudera provides a Solr configuration migration script, solr-upgrade.sh. This script is included with Cloudera Manager 6 and CDH 6.
The following topics describe the steps and procedures to upgrade to CDH 6 if you are using Cloudera Search:
Before You Begin
Before upgrading:
- Stop making changes to your Cloudera Search environment. Make sure that no configuration changes are made to Cloudera Search for the duration of the migration and upgrade.
- Do not create, delete, or modify collections for the duration of the migration and upgrade.
You can continue indexing to existing collections until otherwise instructed.
Solr Configuration Migration Script Overview
Despite widespread enterprise adoption, Solr lacks automated upgrade tooling. It has long been a challenge to understand the implications of a Solr upgrade. Solr admins were required to review the Solr release notes and manually identify configuration changes needed to address incompatibilities or to take advantage of new features. Additionally, admins had to determine whether they could upgrade existing indexes, or if they had to re-index the raw data.
Starting in Cloudera Enterprise 6, Cloudera provides a Solr configuration migration script that simplifies the upgrade process by providing upgrade instructions tailored to your configuration. These instructions can help you answer the following questions:
- Does my Solr configuration use any configurations that are incompatible with the new version? If so, which ones?
- For each incompatibility, what do I need to do to address it? Where can I get more information about this incompatibility, and why it was introduced?
- Are there any changes in Lucene or Solr that require me to do a full re-index, or is it sufficient to upgrade the index?
This tool is built on an Extensible Stylesheet Language Transformations (XSLT) engine. The upgrade rules, implemented as XSLT transformations, can identify incompatibilities and in some cases fix them automatically.
In general, incompatibilities are categorized as follows:
- ERROR: Removal of a Lucene or Solr configuration element (such as a field type) is marked as ERROR in the validation output. These types of incompatibilities typically result in failure to start the Solr service or load the core. To address this, you must manually fix the Solr configuration.
- WARNING: Deprecation of a configuration element in the new Solr version is marked as WARNING in the validation output. In general, these types of incompatibilities do not prevent starting the Solr service or loading cores, but may prevent applications from using new Lucene or Solr features (or bug fixes). You can choose to fix such incompatibilities by changing the Solr configuration, using your application-specific knowledge.
- INFO: Incompatibilities that can be fixed automatically by rewriting the Solr configuration are marked INFO in the validation output. This can include incompatibilities in the underlying Lucene implementation (for example, LUCENE-6058) that would require rebuilding the index instead of upgrading it. Typically, these incompatibilities do not result in failure to start the Solr service or load cores, but may affect query result accuracy or consistency of the underlying indexed data.
Running the Solr Configuration Migration Script

The Solr configuration migration script, solr-upgrade.sh, is included with CDH 6 and Cloudera Manager 6 agent software. This enables you to run the script after upgrading to Cloudera Manager 6, and before upgrading to CDH 6.
The script is located at:
- Cloudera Manager 6: /opt/cloudera/cm/solr-upgrade/solr-upgrade.sh
- CDH 6 Parcels: /opt/cloudera/parcels/CDH/lib/solr/solr-upgrade/solr-upgrade.sh
- CDH 6 Packages: /usr/lib/solr/solr-upgrade/solr-upgrade.sh
When running the script included with Cloudera Manager, you must specify the location of the CDH 5 Solr binaries using the CDH_SOLR_HOME environment variable. If you are using parcels, Solr binaries are located at /opt/cloudera/parcels/CDH/lib/solr. For package installations, the location is /usr/lib/solr.
For example:
export CDH_SOLR_HOME=/opt/cloudera/parcels/CDH/lib/solr
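The choice between the two locations can be scripted. The following is a minimal sketch, not part of the Cloudera tooling: the function name detect_cdh_solr_home and its optional root-prefix argument are illustrative assumptions, and it simply prefers the parcel path when both directories exist.

```shell
#!/usr/bin/env bash
# Sketch: pick the Solr binaries directory for CDH_SOLR_HOME.
# detect_cdh_solr_home is a hypothetical helper, not a Cloudera command.
# The optional first argument is a filesystem prefix, useful only for testing.
detect_cdh_solr_home() {
  local root="${1:-}"
  if [ -d "$root/opt/cloudera/parcels/CDH/lib/solr" ]; then
    # Parcel installation
    echo "$root/opt/cloudera/parcels/CDH/lib/solr"
  elif [ -d "$root/usr/lib/solr" ]; then
    # Package installation
    echo "$root/usr/lib/solr"
  else
    # Neither location exists on this host
    return 1
  fi
}
```

A possible use is export CDH_SOLR_HOME="$(detect_cdh_solr_home)", followed by a check that the variable is not empty before running solr-upgrade.sh.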
The solr-upgrade.sh command syntax is as follows:
./solr-upgrade.sh help

Usage: ./solr-upgrade.sh command [command-arg]

Options:
    --zk zk_ensemble
    --debug    Prints error output of calls
    --trace    Prints executed commands

Commands:
    help
    download-metadata -d dest_dir
    validate-metadata -c metadata_dir
    restore-metadata -c metadata_dir
    config-upgrade [--dry-run] -c conf_path -t conf_type -u upgrade_processor_conf -d result_dir [-v]

Parameters:
    -c <arg>     This parameter specifies the path of Solr configuration to be operated upon.
    -t <arg>     This parameter specifies the type of Solr configuration to be validated and transformed. The currently accepted values are schema, solrconfig and solrxml.
    -d <arg>     This parameter specifies the directory path where the result of the command should be stored.
    -u <arg>     This parameter specifies the path of the Solr upgrade processor configuration.
    --dry-run    This command will perform compatibility checks for the specified Solr configuration.
    -v           This parameter enables printing XSLT compiler warnings on the command output.
Upgrading Cloudera Search to CDH 6

Use the following procedures to migrate your Cloudera Search configuration and upgrade to CDH 6. The provided migration script cannot upgrade the Lucene index files. After upgrading, you must re-index your collections. For more information, see Reindexing in Solr in the Solr wiki. The upgrade process is as follows:
Back Up Solr Configuration and Data
Before upgrading to CDH 6, back up your Solr collections using the following procedure. This allows you to roll back to the pre-upgrade state if any problems occur during the upgrade process.
- If your cluster is using Kerberos authentication, authenticate as the hdfs superuser principal:
kinit hdfs@<EXAMPLE.COM>
Replace <EXAMPLE.COM> with your Kerberos realm name.
- If you have secured ZooKeeper using ACLs, set the ZKCLI_JVM_FLAGS environment variable as follows:
export ZKCLI_JVM_FLAGS="-Djava.security.auth.login.config=/path/to/jaas.conf \
  -DzkACLProvider=org.apache.solr.common.cloud.ConfigAwareSaslZkACLProvider \
  -Droot.logger=INFO,console"
- Create directories in HDFS to store the backups. If your cluster does not have security enabled, run the following commands as the hdfs user by adding sudo -u hdfs before each command:
hdfs dfs -mkdir /solr-backup
hdfs dfs -mkdir /solr-backup/zk-backup
hdfs dfs -mkdir /solr-backup/hdfs-backup
hdfs dfs -mkdir /solr-backup/localfs-backup
- Back up the Solr configuration from ZooKeeper. If your cluster does not have security enabled, add sudo -u hdfs before the hdfs dfs command, and skip the kinit commands:
- Cloudera Manager:
mkdir $HOME/cdh5-backup
export CDH_SOLR_HOME=/opt/cloudera/parcels/CDH/lib/solr
kinit solr@EXAMPLE.COM
/opt/cloudera/cm/solr-upgrade/solr-upgrade.sh download-metadata -d $HOME/cdh5-backup
kinit hdfs@EXAMPLE.COM
hdfs dfs -copyFromLocal $HOME/cdh5-backup/* /solr-backup/zk-backup
- Unmanaged:
mkdir $HOME/cdh5-backup
export CDH_SOLR_HOME=/usr/lib/solr
kinit solr@EXAMPLE.COM
/usr/lib/solr/solr-upgrade/solr-upgrade.sh download-metadata -d $HOME/cdh5-backup
kinit hdfs@EXAMPLE.COM
hdfs dfs -copyFromLocal $HOME/cdh5-backup/* /solr-backup/zk-backup
- Back up the Solr configuration metadata from the local filesystem of each host running the Solr service. The directory (/var/lib/solr by default) is specified in Cloudera Manager. Run the following commands on each Solr server host. If your cluster does not have security enabled, add sudo -u hdfs before each command:
hdfs dfs -mkdir "/solr-backup/localfs-backup/$(hostname)"
hdfs dfs -copyFromLocal /var/lib/solr/*_replica* "/solr-backup/localfs-backup/$(hostname)"
- Copy the Solr HDFS data directory (/solr by default) to the backup HDFS directory (/solr-backup/hdfs-backup in this example). If your cluster does not have security enabled, run the following command as the hdfs user by adding sudo -u hdfs before the command:
hdfs dfs -cp /solr/* /solr-backup/hdfs-backup
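Before moving on, it can help to confirm that the backup locations used in the steps above exist. The following is a minimal sketch under stated assumptions: backup_dirs_present and the HDFS_CMD variable are hypothetical names introduced here, and the check relies only on hdfs dfs -test -d. It confirms that the directories exist, not that the copied data is complete.

```shell
#!/usr/bin/env bash
# Sketch: check that the three backup directories from this procedure exist.
# HDFS_CMD and backup_dirs_present are illustrative names (assumptions).
HDFS_CMD="${HDFS_CMD:-hdfs}"

backup_dirs_present() {
  local root="${1:-/solr-backup}" sub missing=0
  for sub in zk-backup hdfs-backup localfs-backup; do
    # "hdfs dfs -test -d" exits 0 only when the path is a directory.
    if ! "$HDFS_CMD" dfs -test -d "$root/$sub"; then
      echo "missing: $root/$sub" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, backup_dirs_present /solr-backup returns a non-zero status and names each missing directory on standard error. On a cluster without security enabled, the hdfs calls would still need to run as the hdfs user, as in the commands above.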
Migrating the Configuration
After you have backed up the Solr configuration and data, create a copy of the backup on which to run the migration tool. The tool supports migrating schema.xml (and managed-schema), solrconfig.xml, and solr.xml configuration files.
- Create a working directory as a copy of the backup you created in Back Up Solr Configuration and Data. This allows you to run the migration script while preserving your original backup:
cp -r $HOME/cdh5-backup $HOME/cdh6-migrated
- Create an output directory for the script:
mkdir $HOME/cdh6-staging
- Migrate the solr.xml file:
- Run the migration script:
- Cloudera Manager:
export CDH_SOLR_HOME=/opt/cloudera/parcels/CDH/lib/solr
/opt/cloudera/cm/solr-upgrade/solr-upgrade.sh config-upgrade -t solrxml \
  -c $HOME/cdh6-migrated/solr.xml \
  -u /opt/cloudera/cm/solr-upgrade/validators/solr_4_to_7_processors.xml \
  -d $HOME/cdh6-staging
- Unmanaged:
export CDH_SOLR_HOME=/usr/lib/solr
/usr/lib/solr/solr-upgrade/solr-upgrade.sh config-upgrade -t solrxml \
  -c $HOME/cdh6-migrated/solr.xml \
  -u /usr/lib/solr/solr-upgrade/validators/solr_4_to_7_processors.xml \
  -d $HOME/cdh6-staging
- If the script reports any incompatibilities, fix them in the working directory ($HOME/cdh6-migrated/solr.xml in this example) and then re-run the script. Repeat until the script outputs no incompatibilities and the solr.xml migration is successful.
- Copy the migrated solr.xml file to the working directory:
cp $HOME/cdh6-staging/solr.xml $HOME/cdh6-migrated/solr.xml
- For each collection configuration set in $HOME/cdh6-migrated/configs/, migrate the configuration and schema:
- Run the migration script for solrconfig.xml and schema.xml (or managed-schema). For example, for a configuration set named tweets_config:
- Cloudera Manager:
/opt/cloudera/cm/solr-upgrade/solr-upgrade.sh config-upgrade -t solrconfig \
  -c $HOME/cdh6-migrated/configs/tweets_config/conf/solrconfig.xml \
  -u /opt/cloudera/cm/solr-upgrade/validators/solr_4_to_7_processors.xml \
  -d $HOME/cdh6-staging
/opt/cloudera/cm/solr-upgrade/solr-upgrade.sh config-upgrade -t schema \
  -c $HOME/cdh6-migrated/configs/tweets_config/conf/schema.xml \
  -u /opt/cloudera/cm/solr-upgrade/validators/solr_4_to_7_processors.xml \
  -d $HOME/cdh6-staging
- Unmanaged:
/usr/lib/solr/solr-upgrade/solr-upgrade.sh config-upgrade -t solrconfig \
  -c $HOME/cdh6-migrated/configs/tweets_config/conf/solrconfig.xml \
  -u /usr/lib/solr/solr-upgrade/validators/solr_4_to_7_processors.xml \
  -d $HOME/cdh6-staging
/usr/lib/solr/solr-upgrade/solr-upgrade.sh config-upgrade -t schema \
  -c $HOME/cdh6-migrated/configs/tweets_config/conf/schema.xml \
  -u /usr/lib/solr/solr-upgrade/validators/solr_4_to_7_processors.xml \
  -d $HOME/cdh6-staging
- If the script reports any incompatibilities, fix them in the working directory ($HOME/cdh6-migrated/configs/<configName>/conf/solrconfig.xml in this example) and then re-run the script. Repeat until the script outputs no incompatibilities and the solrconfig.xml migrations are all successful.
- After fixing all incompatibilities, copy the migrated solrconfig.xml and schema.xml (or managed-schema) files to the conf directory of the configuration set in the working directory. Copy only the schema file that your configuration set uses. For example, for the tweets_config configuration set:
cp $HOME/cdh6-staging/solrconfig.xml $HOME/cdh6-migrated/configs/tweets_config/conf/solrconfig.xml
cp $HOME/cdh6-staging/schema.xml $HOME/cdh6-migrated/configs/tweets_config/conf/schema.xml
cp $HOME/cdh6-staging/managed-schema $HOME/cdh6-migrated/configs/tweets_config/conf/managed-schema
- Repeat for all configuration sets in $HOME/cdh6-migrated/configs/.
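When there are many configuration sets, the per-set commands above can be wrapped in a loop. The following is a minimal sketch, not a Cloudera-provided tool: the function name migrate_configsets, the SOLR_UPGRADE and PROCESSORS variables, and the use of one staging subdirectory per configuration set are all assumptions made for illustration. It also assumes that solr-upgrade.sh exits with a non-zero status when an invocation fails, which this document does not state.

```shell
#!/usr/bin/env bash
# Sketch: run config-upgrade for every configuration set in the working copy.
# SOLR_UPGRADE, PROCESSORS and migrate_configsets are illustrative names.
# Defaults below are the Cloudera Manager paths from this procedure.
SOLR_UPGRADE="${SOLR_UPGRADE:-/opt/cloudera/cm/solr-upgrade/solr-upgrade.sh}"
PROCESSORS="${PROCESSORS:-/opt/cloudera/cm/solr-upgrade/validators/solr_4_to_7_processors.xml}"

migrate_configsets() {
  local migrated_dir="$1" staging_dir="$2" conf_dir name schema
  for conf_dir in "$migrated_dir"/configs/*/conf; do
    [ -d "$conf_dir" ] || continue
    name="$(basename "$(dirname "$conf_dir")")"
    # Assumption: a separate output directory per configuration set, so that
    # results from different sets do not overwrite each other.
    mkdir -p "$staging_dir/$name"
    "$SOLR_UPGRADE" config-upgrade -t solrconfig \
      -c "$conf_dir/solrconfig.xml" -u "$PROCESSORS" \
      -d "$staging_dir/$name" || return 1
    # Fall back to managed-schema when schema.xml is absent.
    schema="$conf_dir/schema.xml"
    [ -f "$schema" ] || schema="$conf_dir/managed-schema"
    "$SOLR_UPGRADE" config-upgrade -t schema \
      -c "$schema" -u "$PROCESSORS" \
      -d "$staging_dir/$name" || return 1
  done
}
```

A call such as migrate_configsets "$HOME/cdh6-migrated" "$HOME/cdh6-staging" would then leave the migrated files for each set under $HOME/cdh6-staging/<configName>. Reviewing reported incompatibilities and copying the results back into the working directory remain manual steps.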
Validating the Migrated Configuration
The solr-upgrade.sh script includes a validate-metadata command that you can run against the migrated Solr configuration and metadata to make sure that they can be used to re-initialize the Solr service after the upgrade. The script performs a series of checks to make sure that:
- Required configuration files (such as solr.xml, clusterstate.json, and collection configuration sets) are present.
- The configuration files are compatible with the Solr version being upgraded to (Solr 7, in this case).
For example:
./solr-upgrade.sh validate-metadata -c $HOME/cdh6-migrated
Validating metadata in /home/jdoe/solr
validating solr configuration using config upgrade processor @ ./validators/solr_4_to_7_processors.xml

---------- validating solr.xml ----------
Validating solrxml...
No configuration errors found...
No configuration warnings found...
Solr solrxml validation is successful.
---------- validation successful for solr.xml ----------

---------- validating configset books_config ----------
Validating solrconfig...
No configuration errors found...
No configuration warnings found...
Solr solrconfig validation is successful.
Validating schema...
Following configuration errors found:
  * Legacy field type (name = pint and class = solr.IntField) is removed.
  * Legacy field type (name = plong and class = solr.LongField) is removed.
  * Legacy field type (name = pfloat and class = solr.FloatField) is removed.
  * Legacy field type (name = pdouble and class = solr.DoubleField) is removed.
  * Legacy field type (name = pdate and class = solr.DateField) is removed.
No configuration warnings found...
Please note that in Solr 7:
  * The implicit default Similarity is changed to SchemaSimilarityFactory
Solr schema validation failed.
If the validation fails, you can revisit the steps in Migrating the Configuration.
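If you save the validation output to a file, the result can be checked in a script. The following is a minimal sketch that relies only on the message wording shown in the example output above ("validation failed" and "validation is successful"). The function name validation_passed is a hypothetical helper, and the wording may differ in other versions of the script.

```shell
#!/usr/bin/env bash
# Sketch: decide from a saved validate-metadata log whether validation passed.
# validation_passed is an illustrative helper, not part of solr-upgrade.sh.
validation_passed() {
  local log="$1"
  # Any "validation failed" line means at least one file did not validate.
  if grep -q "validation failed" "$log"; then
    return 1
  fi
  # Require at least one explicit success line so an empty log does not pass.
  grep -q "validation is successful" "$log"
}
```

For example, after running ./solr-upgrade.sh validate-metadata -c $HOME/cdh6-migrated > validate.log, the call validation_passed validate.log returns a zero status only when no check failed.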
Re-Initializing the Solr Service
After migrating your Solr configuration and validating it, you must re-initialize your Solr service.

- If your cluster is using Kerberos authentication, authenticate as the hdfs principal:
kinit hdfs@<EXAMPLE.COM>
- After confirming that you have backed up your Solr configuration, indexes, and data, delete the Solr HDFS data directory (/solr by default). If your cluster does not have security enabled, run the command as the hdfs user by adding sudo -u hdfs before the command:
hdfs dfs -rm -r /solr/*
- Delete the /solr znode in ZooKeeper:
- If you have secured ZooKeeper using ACLs, set the ZKCLI_JVM_FLAGS environment variable and then authenticate as the solr principal as follows:
export ZKCLI_JVM_FLAGS="-Djava.security.auth.login.config=/path/to/jaas.conf \
  -DzkACLProvider=org.apache.solr.common.cloud.ConfigAwareSaslZkACLProvider \
  -Droot.logger=INFO,console"
kinit solr@<EXAMPLE.COM>
- Specify the ZooKeeper quorum using the SOLR_ZK_ENSEMBLE environment variable. For example, if your ZooKeeper quorum consists of three hosts, zk01, zk02, and zk03:
export SOLR_ZK_ENSEMBLE=zk01.example.com:2181,zk02.example.com:2181,zk03.example.com:2181/solr
- Clear the /solr znode:
${CDH_SOLR_HOME}/bin/zkcli.sh -zkhost ${SOLR_ZK_ENSEMBLE} -cmd clear /
- After confirming that you have backups of the /var/lib/solr/*_replica* files on the local filesystem of each Solr server host, delete everything within /var/lib/solr/ (but not the directory itself):
sudo -u solr rm -rf /var/lib/solr/*
Repeat this step on all cluster hosts running the Solr service.
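The SOLR_ZK_ENSEMBLE value used when clearing the znode is a comma-separated list of host:port pairs followed by the znode path. The following is a minimal sketch for assembling it from a host list. The function name build_zk_ensemble is a hypothetical helper, and the port 2181 and /solr path are taken from the example in this procedure.

```shell
#!/usr/bin/env bash
# Sketch: build a ZooKeeper ensemble string of the form
#   host1:port,host2:port,.../chroot
# build_zk_ensemble is an illustrative helper, not a Cloudera command.
build_zk_ensemble() {
  local port="$1" chroot="$2" host joined=""
  shift 2
  for host in "$@"; do
    # Append a comma only between entries, never before the first one.
    joined="${joined:+$joined,}${host}:${port}"
  done
  printf '%s%s\n' "$joined" "$chroot"
}
```

For example, export SOLR_ZK_ENSEMBLE="$(build_zk_ensemble 2181 /solr zk01.example.com zk02.example.com zk03.example.com)" produces the same value as the export shown in the procedure.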
Upgrade to CDH 6
After completing all of these procedures, upgrade to CDH following the regular process as documented in Upgrading CDH and Managed Services Using Cloudera Manager or Upgrading Unmanaged CDH Using the Command Line. After the upgrade is complete, continue to Restoring Your Configuration and Re-Indexing Collections After Upgrading to CDH 6.
Restoring Your Configuration and Re-Indexing Collections After Upgrading to CDH 6
After upgrading to CDH 6, your Solr service in Cloudera Manager will report bad health due to missing configuration files. To restore your configuration and re-create your collections:
- Shut down the Solr service:
- Cloudera Manager:
- Unmanaged: sudo service solr-server stop
- Use the solr-upgrade.sh script to upload the migrated configurations to ZooKeeper:
- If you have secured ZooKeeper using ACLs, set the ZKCLI_JVM_FLAGS environment variable and then authenticate as the solr principal as follows:
export ZKCLI_JVM_FLAGS="-Djava.security.auth.login.config=/path/to/jaas.conf \
  -DzkACLProvider=org.apache.solr.common.cloud.ConfigAwareSaslZkACLProvider \
  -Droot.logger=INFO,console"
kinit solr@<EXAMPLE.COM>
- Run the solr-upgrade.sh script as follows:
- Cloudera Manager:
export CDH_SOLR_HOME=/opt/cloudera/parcels/CDH/lib/solr
/opt/cloudera/cm/solr-upgrade/solr-upgrade.sh bootstrap-config \
  -c $HOME/cdh6-migrated
- Unmanaged:
export CDH_SOLR_HOME=/usr/lib/solr
/usr/lib/solr/solr-upgrade/solr-upgrade.sh bootstrap-config \
  -c $HOME/cdh6-migrated
- Start the Solr service:
- Cloudera Manager:
- Unmanaged: sudo service solr-server start
- Recreate your collections.