Oracle Commerce: Content Acquisition System Installation Guide Version 11.1 - July 2014
Contents
Preface
About this guide
Who should use this guide
Conventions used in this guide
Contacting Oracle Support
For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at
http://www.oracle.com/pls/topic/lookup?ctx=acc&id=docacc.
Oracle customers have access to electronic support through My Oracle Support. For information, visit
http://www.oracle.com/pls/topic/lookup?ctx=acc&id=info or visit
http://www.oracle.com/pls/topic/lookup?ctx=acc&id=trs if you are hearing impaired.
Oracle Commerce
Preface
Oracle Commerce Guided Search is the most effective way for your customers to dynamically explore your
storefront and find relevant and desired items quickly. An industry-leading faceted search and Guided Navigation
solution, Guided Search enables businesses to influence customers in each step of their search experience.
At the core of Guided Search is the MDEX Engine, a hybrid search-analytical database specifically designed
for high-performance exploration and discovery. The Oracle Commerce Content Acquisition System provides
a set of extensible mechanisms to bring both structured data and unstructured content into the MDEX Engine
from a variety of source systems. The Oracle Commerce Assembler dynamically assembles content from any
resource and seamlessly combines it into results that can be rendered for display.
Oracle Commerce Experience Manager enables non-technical users to create, manage, and deliver targeted,
relevant content to customers. With Experience Manager, you can combine unlimited variations of virtual
product and customer data into personalized assortments of relevant products, promotions, and other content
and display it to buyers in response to any search or facet refinement. Out-of-the-box templates and experience
cartridges are provided for the most common use cases; technical teams can also use a software developer's
kit to create custom cartridges.
Chapter 1
System requirements
See the Oracle Commerce Supported Environments Matrix document in the My Oracle Support knowledge
base at https://support.oracle.com/ for information on supported operating systems and Web browsers.
Hard disk capacity must be sufficient to store the records written to the Record Store or to record output files.
Please contact your Oracle representative if you need more information on hardware sizing.
Minimum hardware requirements:
x64 processor, minimum 1.8 GHz
2 GB of RAM
At least an 80 GB hard drive, depending on the size of your application data set
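A pre-installation script might compare a host against these minimums. The sketch below is illustrative only: meets_cas_minimums is a hypothetical helper, and the caller is assumed to supply the CPU speed in MHz, the RAM in MB, and the available disk space in GB, because the way those values are gathered differs by operating system.

```shell
#!/bin/sh
# Sketch: compare measured values against the minimums listed above.
# meets_cas_minimums is a hypothetical helper; all arguments are whole numbers.
# Note: a machine sold as "2 GB" may report slightly less than 2048 MB.
meets_cas_minimums() {
    cpu_mhz=$1
    ram_mb=$2
    disk_gb=$3
    status=0
    [ "$cpu_mhz" -ge 1800 ] || { echo "CPU is below 1.8 GHz" >&2; status=1; }
    [ "$ram_mb" -ge 2048 ]  || { echo "RAM is below 2 GB" >&2; status=1; }
    [ "$disk_gb" -ge 80 ]   || { echo "Disk is below 80 GB" >&2; status=1; }
    return $status
}
```

The 80 GB figure is only a floor; as noted above, the disk space actually needed depends on the size of the application data set.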
For a list of supported file formats, see "Appendix B File Formats Supported by the CAS Document Conversion
Module" in the Endeca CAS Developer's Guide.
Recommended reading
Before installing, Oracle recommends that you read the following documents for important information about
the release.
Getting Started Guide
The Oracle Endeca Commerce Getting Started Guide gives an overview of Oracle Endeca components and
includes information about configuration scenarios. After installing all the components in your deployment,
read this guide for information on verifying your installation. You can download the Oracle Endeca Commerce
Getting Started Guide from the Oracle Technology Network.
Release Notes
Refer to the release notes for information about new features, changed features, and bug fixes for this release.
The release notes (README.txt) are part of the CAS documentation download. After installation, release
notes are also available in the following location:
Windows: CAS\<version>
UNIX: CAS/<version>
Migration Guide
Refer to the Endeca CAS Migration Guide for information about migrating your implementation from a previous
version of Endeca software. You can download the Endeca CAS Migration Guide from the Oracle Technology
Network.
Chapter 2
Installing on Windows
This section provides instructions for installing the Endeca Content Acquisition System on Windows.
6. In the Destination Folder screen, select an installation location or accept the default location of
C:\Endeca\CAS and then click Next.
7. In the Endeca CAS Service Information screen, specify the user name, password, and domain information
for the user who will run the CAS Service and then click Next. (This is typically the endeca user you created
in the previous procedure.)
8. In the CAS Server Information screen, enter the CAS Server port and CAS Server shutdown port, or
accept the default values of 8500 and 8506.
9. In the Completing the Setup Wizard screen, click Next.
The CAS Service starts automatically after installation.
Related Links
Installing CAS silently on Windows
The silent installer is useful if you want to add the CAS installation to an install script, or push out the
installation on multiple machines. The silent installer is not interactive.
Updating the Deployment Template to use the WSDL client stubs and the CAS Deployment Template component
This task is optional. It may be necessary if you did not integrate the CAS Deployment Template
component into the Deployment Template during the installation process and then later found that
you need the Deployment Template to manage crawling operations.
Creating a user for the Endeca services on Windows
You must run the Endeca services as a specified user, for which you can control permissions.
Option
Description
OCcas-<version>-arch-OS.exe
/s
/l
CASHOST
Optional. Specifies the host name of the machine where you want
to install and run CAS. If omitted, the default value is
localhost.
CASPORT
CASSTOPPORT
Optional. Specifies the port used to stop the CAS Service. If omitted,
the default value is 8506.
TARGETDIR
CASSELECTED
CONSOLESELECTED
Console specify FALSE for this option. If omitted, the default value
is TRUE.
DTSELECTED
SAMPLESSELECTED
USERNAME
Required. Specifies the user running the Endeca CAS Service. This
is typically the endeca user.
PASSWORD
DOMAINNAME
Updating the Deployment Template to use the WSDL client stubs and the CAS
Deployment Template component
This task is optional. It may be necessary if you did not integrate the CAS Deployment Template component
into the Deployment Template during the installation process and then later found that you need the Deployment
Template to manage crawling operations.
To update the Deployment Template:
Copy the new CAS Deployment Template component into the Deployment Template:
Both the WSDL client stubs and the CAS Deployment Template component (the ContentAcquisition
ServerComponent class) are packaged in casStubs.jar.
For details on upgrading applications deployed for a previous version of CAS, see the Endeca CAS Migration
Guide.
Installing on UNIX
This section provides instructions for installing the Endeca Content Acquisition System on UNIX.
If you chose to install the CAS Console, you must restart the Endeca Tools Service. See the Oracle Endeca
Tools and Frameworks Installation Guide.
Related Links
Installing CAS silently on UNIX
The silent installer is useful if you want to add the CAS installation to your own install script, or push
out the installation on multiple machines.
Updating the Deployment Template to use the WSDL client stubs and the CAS Deployment Template component
This task is optional. It may be necessary if you did not integrate the CAS Deployment Template
component into the Deployment Template during the installation process and then later found that
you need the Deployment Template to manage crawling operations.
./OCcas-version-arch-OS.sh
silent.txt
If $ENDECA_TOOLS_ROOT and $ENDECA_TOOLS_CONF are not set in the environment or you want to
override their values, specify their respective flags:
To skip installation of the CAS Console, you must specify the --skip_console_installation flag:
./OCcas-version-arch-OS.sh --silent --target /usr/local < silent.txt --skip_console_installation
To skip CAS integration with the Deployment Template, you must specify the --skip_dt_integration
flag:
./OCcas-version-arch-OS.sh --silent --target /usr/local < silent.txt --skip_dt_integration
Use this flag if the CAS environment does not require the Deployment Template to manage crawling
operations.
To install samples of CAS code and configuration files, you must specify the --install_samples
flag:
./OCcas-version-arch-OS.sh --silent --target /usr/local < silent.txt --install_samples
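When the silent installer is called from a larger install script, the optional flags can be assembled by a small helper. The following is a minimal sketch and not part of the CAS distribution: the function name build_cas_install_cmd and its option keywords (no-console, no-dt, samples) are hypothetical, and it assumes the flags can be combined on one command line, which the text above does not state explicitly. The function only prints the command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Sketch: assemble a silent-install command line from the flags documented above.
# build_cas_install_cmd and its option keywords are hypothetical helper names.
build_cas_install_cmd() {
    if [ "$#" -lt 3 ]; then
        echo "usage: build_cas_install_cmd <installer> <target> <response-file> [option]..." >&2
        return 1
    fi
    installer=$1      # for example ./OCcas-version-arch-OS.sh
    target=$2         # for example /usr/local
    response=$3       # for example silent.txt
    shift 3
    flags=""
    for opt in "$@"; do
        case $opt in
            no-console) flags="$flags --skip_console_installation" ;;
            no-dt)      flags="$flags --skip_dt_integration" ;;
            samples)    flags="$flags --install_samples" ;;
            *) echo "unknown option: $opt" >&2; return 1 ;;
        esac
    done
    # Flags follow the input redirection, matching the examples above.
    echo "$installer --silent --target $target < $response$flags"
}

# Prints: ./OCcas-version-arch-OS.sh --silent --target /usr/local < silent.txt --skip_console_installation
build_cas_install_cmd ./OCcas-version-arch-OS.sh /usr/local silent.txt no-console
```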
Following installation:
To start the CAS Service, navigate to CAS/<version>/bin and run the following command:
cas-service.sh
If you chose to install the CAS Console, you must restart the Endeca Tools Service. See the Oracle Endeca
Workbench Installation Guide.
The Content Acquisition System detects each plug-in and validates the extensions within it by checking the
uniqueness of extension IDs and by checking for the presence of an annotation of either @CasDataSource
or @CasManipulator for each extension.
To install a plug-in into CAS:
1. Stop Endeca CAS Service.
2. Navigate to <install path>\CAS\<version>\lib\cas-server-plugins and create a
plugin-name subdirectory for each plug-in.
For example: CAS\<version>\lib\cas-server-plugins\JDBCDataSourceExt
3. Copy the plug-in JAR or JARs, and any dependent JAR files, to
<install path>\CAS\<version>\lib\cas-server-plugins\plugin-name.
4. Repeat the steps above as necessary for multiple plug-ins.
5. Start Endeca CAS Service.
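On UNIX, steps 2 and 3 could be scripted roughly as follows. This is a sketch only: the function name install_cas_plugin is hypothetical, the forward-slash layout CAS/<version>/lib/cas-server-plugins is assumed to mirror the Windows path shown above, and stopping and starting the CAS Service (steps 1 and 5) is left to the operator.

```shell
#!/bin/sh
# Sketch: copy a plug-in's JAR files into its own subdirectory under
# lib/cas-server-plugins. install_cas_plugin is a hypothetical helper name.
install_cas_plugin() {
    if [ "$#" -lt 3 ]; then
        echo "usage: install_cas_plugin <CAS version dir> <plugin-name> <jar>..." >&2
        return 1
    fi
    cas_version_dir=$1   # the CAS/<version> directory (assumed UNIX layout)
    plugin_name=$2       # for example JDBCDataSourceExt
    shift 2
    plugin_dir="$cas_version_dir/lib/cas-server-plugins/$plugin_name"
    # Step 2: create one subdirectory per plug-in.
    mkdir -p "$plugin_dir" || return 1
    # Step 3: copy the plug-in JARs and any dependent JARs.
    for jar in "$@"; do
        cp "$jar" "$plugin_dir/" || return 1
    done
    # Report where the files were placed.
    echo "$plugin_dir"
}
```

Call the function once per plug-in (step 4), passing every JAR that the plug-in depends on.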
You can confirm that an extension is installed by running the listModules task of the CAS Server Command-line
Utility and specifying a moduleType of either SOURCE or MANIPULATOR. The task returns the installed modules.
For example, this task shows that a custom data source named Sample Data Source for testing is
installed:
C:\Endeca\CAS\<version>\bin>cas-cmd listModules -t SOURCE
Sample Data Source
*Id: Sample Data Source
*Type: SOURCE
*Description: Sample Data Source for testing
File System
*Id: File System
*Type: SOURCE
*Description: No description available for File System
*Capabilities:
*Binary Content Accessible via FileSystem
*Data Source Filter
*Has Binary Content
*Expand Archives
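A script can check the listModules output for a particular module Id. The sketch below assumes the output format shown above, in which each module has a line of the form *Id: <name>. The function name has_cas_module is hypothetical, and the output is read from standard input so that no command-line options beyond those shown above are assumed.

```shell
#!/bin/sh
# Sketch: succeed if listModules output (read from stdin) contains a module
# whose Id matches the first argument exactly.
# has_cas_module is a hypothetical helper name.
has_cas_module() {
    wanted=$1
    # Trim trailing whitespace, keep only "*Id:" lines with the prefix removed,
    # then require an exact whole-line match.
    sed -n -e 's/[[:space:]]*$//' -e 's/^[[:space:]]*\*Id:[[:space:]]*//p' |
        grep -Fxq -- "$wanted"
}

# Example with two lines in the format shown above:
printf '%s\n' 'File System' ' *Id: File System' | has_cas_module 'File System' &&
    echo 'File System module is installed'
```

In practice the output of the listModules command would be piped into has_cas_module in place of the printf sample.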
Installing the Content Acquisition System | Package contents and directory structure
...
workspace
The contents of the CAS directory are described here in detail.
Directory
Contents
version\bin
version\console
version\doc
version\doc\wsdl
The Web Service (WSDL) files for the CAS Server, the Component
Instance Manager, and the Record Store.
version\java
The JDK used to run the CAS components (except CAS Console, which
runs in the Endeca Tools Service).
version\lib
Libraries for the CAS command-line utilities, including the CAS Server
utility, the Component Instance Manager utility, and the Record Store
utility.
Libraries for the CAS APIs, including the CAS Server API, the
Component Instance Manager API, the Record Store API, and the
CAS Extension API.
version\lib\cas-dt
version\lib\cas-server-plugins
Libraries for CAS plug-ins including CMS connectors and custom
extensions (if applicable).
version\lib\oit-sx
version\lib\recordstore-forge-adapter
version\lib\web-crawler
version\sample
version\webapps
The root.war file, which is the CAS Server and Component Instance
Manager applications.
version\workspace_template
The template for the workspace directory that contains configuration
files.
workspace
The working directory for the CAS Server and the Web Crawler.
workspace\conf
The commandline.properties file, which contains the CAS
Service settings necessary for the CAS command-line utilities to run.
Three logging configuration files
(cas-service.log4j.properties for the CAS Service,
recordstore-cmd.log4j.properties for the Record Store,
and cas-cmd.log4j.properties for the Command-line Utility).
The Jetty configuration files.
workspace\conf\web-crawler\default
The default configuration files for the Web Crawler, including the
log4j.properties logging configuration file.
workspace\conf\web-crawler\non-polite-crawl
Sample crawl configuration files for non-polite crawls. As with the polite
version, the settings in these files will override the default settings.
workspace\conf\web-crawler\polite-crawl
workspace\logs
The cas-service.log file, which contains the CAS Service log output,
and includes log messages from all crawls managed by the CAS Server.
workspace\output
Default destination directory for the crawl output from the Web Crawler.
The output directory is not present upon installation. It is created when
the Web Crawler writes output records for a crawl.
workspace\state
State files for the CAS Service components. State files can include
Record Store instances, state directories for data source extension
information, and state directories for manipulator extension information.
Note: There is no logs directory for the Web Crawler, because by default the Web Crawler sends its
standard output to the console. However, you can modify the log4j.properties file to send the output
to a file.
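As an illustration of the kind of change involved, the sketch below prints a file-appender fragment in standard log4j 1.x property syntax. It assumes that the Web Crawler's log4j.properties file uses log4j 1.x, which the text above does not confirm. The appender name FILE, the INFO level, the conversion pattern, and the helper name print_file_appender are all illustrative; check the logger and appender names already present in the shipped file before merging anything into it.

```shell
#!/bin/sh
# Sketch: print a log4j 1.x fragment that routes root-logger output to a file.
# print_file_appender is a hypothetical helper; FILE is an illustrative appender name.
print_file_appender() {
    log_file=$1   # destination log file, chosen by the caller
    cat <<EOF
log4j.rootLogger=INFO, FILE
log4j.appender.FILE=org.apache.log4j.FileAppender
log4j.appender.FILE.File=$log_file
log4j.appender.FILE.layout=org.apache.log4j.PatternLayout
log4j.appender.FILE.layout.ConversionPattern=%d %-5p [%c] %m%n
EOF
}
```

Review the printed fragment by hand before copying any of it into log4j.properties, because setting log4j.rootLogger replaces any existing root logger definition in that file.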
About changing the role used for the CAS Console extension
By default, only users with administrative rights can view the CAS Console extension in Oracle Workbench.
However, you can also make this extension visible to non-administrative users.
See the Oracle Endeca Workbench Administrator's Guide for details on changing the visibility of the CAS
Console extension for different user roles.
Note: If you change the role assigned to the CAS Console extension in Workbench and later choose to
unregister the extension, you must remove it manually.
Related Links
Uninstalling CAS Console if its extension configuration was changed
If you need to uninstall the CAS Console extension for Oracle Endeca Workbench and you have
manually edited its extension configuration (for example to assign the extension to a role other than
"admin"), you must manually uninstall the CAS Console as an Oracle Endeca Workbench extension.
Chapter 3
Uninstalling the Content Acquisition System | Uninstalling CAS Console if its extension configuration was
changed
Related Links
About changing the role used for the CAS Console extension
By default, only users with administrative rights can view the CAS Console extension in Oracle
Workbench. However, you can also make this extension visible to non-administrative users.