Usage Statistics

How to Set Up a Statistics Instance

Once Fuse has been successfully installed, do not provide it with a zip file package containing your schemas. Instead, use PUT /admin/instance to send a JSON document containing the keyword statistics and the host name of a node in your Fuse cluster (see Setup a Fuse Statistics Instance). The optional interval parameter (default 300 seconds) specifies how often data is imported. Once this setup is done, you can start the instance (POST /admin/instance), which then automatically and continuously imports and processes data. To change part of the configuration (the interval, the sources, or both), repeat the request at a later time; this has no impact on data that has already been imported.
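The setup request described above could be prepared as in the following minimal sketch. Only the statistics keyword, the host name and the optional interval come from this page; the exact layout of the JSON document (the "sources" and "interval" keys), the helper name build_setup_request and the example host names are assumptions made for illustration. See Setup a Fuse Statistics Instance for the authoritative format.

```python
import json
from urllib.request import Request


def build_setup_request(base_url, hosts, interval=300):
    """Build (but do not send) a PUT /admin/instance request."""
    # Hypothetical payload layout: only the "statistics" keyword, the host
    # names and the optional interval are taken from the documentation.
    payload = {"statistics": {"sources": list(hosts), "interval": interval}}
    return Request(
        url=base_url.rstrip("/") + "/admin/instance",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Example host names are placeholders, not real Fuse nodes.
request = build_setup_request("http://stats.example.com:8080", ["node1.example.com"])
```

The prepared request can be sent with urllib.request.urlopen(request). Repeating it later with a different interval or host list corresponds to the reconfiguration described above.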

The metrics_fetcher Service

The metrics_fetcher service is part of every Fuse instance. Its purpose is to continuously collect metrics data about the instance (CPU usage, memory usage, etc.). The service is not started by default; it has to be started manually at least once before it begins collecting data. The collected data is meant to be imported by a statistics instance, through which it becomes accessible to the user. The data expires after a set amount of time (default 7 days), so it should be imported into a statistics instance before then.
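To illustrate the expiration window, the following sketch checks whether a collected record would already have expired under the default of 7 days. The function name is_expired is hypothetical, and treating a record as expired exactly at the 7-day mark is an assumption; the service's actual expiration logic is not described here.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_EXPIRATION = timedelta(days=7)  # default stated in the documentation


def is_expired(created, now, expiration=DEFAULT_EXPIRATION):
    """Return True if a metrics record is at least `expiration` old."""
    # Assumption: the boundary itself counts as expired.
    return now - created >= expiration


now = datetime(2024, 1, 10, tzinfo=timezone.utc)
fresh = datetime(2024, 1, 8, tzinfo=timezone.utc)   # 2 days old
stale = datetime(2024, 1, 2, tzinfo=timezone.utc)   # 8 days old
```

With these sample timestamps, is_expired(fresh, now) is False and is_expired(stale, now) is True, so only the first record could still be imported.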

To set up the metrics_fetcher service, including the fetch and expiration intervals, see Start metrics_fetcher Service and Deactivate metrics_fetcher Service.

How Data is imported

A statistics instance has a special statistics service that is responsible for continuously importing statistics data from one or more sources. It runs automatically with the instance but can be started or stopped manually if the need arises (see Start, Stop or Restart a Service). During import, it calls each of the provided sources and retrieves any new data that is available. The data is then processed and sent to the update task of the statistics instance for import. This cycle repeats at the interval, in seconds, set by the interval parameter (default 300 seconds).
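The import cycle described above can be pictured as a simple polling loop. This is only a conceptual sketch, not the statistics service's implementation: the callables fetch_new, process and update are hypothetical stand-ins for retrieving new data from a source, processing it and handing it to the update task.

```python
import time


def run_import_cycle(sources, fetch_new, process, update):
    """Run one import pass over all sources; return the number of items sent."""
    imported = 0
    for source in sources:
        # Retrieve and process whatever is new for this source.
        items = [process(item) for item in fetch_new(source)]
        if items:
            update(items)  # hand the batch to the (hypothetical) update task
            imported += len(items)
    return imported


def run_forever(sources, fetch_new, process, update, interval=300):
    """Repeat the import pass every `interval` seconds (default 300)."""
    while True:
        run_import_cycle(sources, fetch_new, process, update)
        time.sleep(interval)


# One pass with stub callables: only "node1" has new data.
received = []
count = run_import_cycle(
    ["node1", "node2"],
    fetch_new=lambda source: [{"cpu": 1.5}] if source == "node1" else [],
    process=lambda item: dict(item, processed=True),
    update=received.extend,
)
```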

Note

For efficiency reasons, the imported monitoring data is divided into two groups: more volatile data (CPU and memory usage), where every recorded item is imported, and more stable data such as index statistics, which is only imported if at least 60 minutes lie between two recorded items (see What Data is available for all available data).
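The 60-minute rule for stable data can be sketched as a filter over recorded items. This sketch assumes that each item is compared against the most recently imported item; the note above does not say whether the comparison is made against the last imported or the last recorded item, and the function name select_stable_items is hypothetical.

```python
from datetime import datetime, timedelta

MIN_GAP = timedelta(minutes=60)  # threshold stated in the note above


def select_stable_items(items, min_gap=MIN_GAP):
    """Keep an item only if it is at least `min_gap` later than the last kept one."""
    kept = []
    for item in sorted(items, key=lambda i: i["created"]):
        if not kept or item["created"] - kept[-1]["created"] >= min_gap:
            kept.append(item)
    return kept


# Items recorded 0, 20, 60, 90 and 130 minutes after an arbitrary start time.
start = datetime(2024, 1, 1, 12, 0)
items = [{"created": start + timedelta(minutes=m)} for m in (0, 20, 60, 90, 130)]
kept_minutes = [
    int((item["created"] - start).total_seconds() // 60)
    for item in select_stable_items(items)
]
```

Under this assumption, the items recorded at minutes 0, 60 and 130 are kept, while those at minutes 20 and 90 are skipped.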

What Data is available

The statistics instance imports three kinds of data:

history data
information about conducted searches and viewed items
error data
details about errors that happened while the instance was running
monitoring data
server monitoring data (CPU usage, hard disk usage, etc.) as well as information about indexes and content data

The imported data is divided into the following content types:

expression string always Search expression.
created datetime always Date and time of the search.
results integer always Number of results the search turned up.
source object always Name and URL of the source instance.

view
fuse_id string always Fuse ID of the item in the parent instance.
created datetime always Date and time when the item was viewed.
url string always URL of the item in the parent instance.
source object always Name and URL of the source instance.

error
level string always Logging level, for error items always ERROR.
time datetime always Date and time of the error.
error string always Name of the error.
message string always Optional error message.
traceback string always Traceback to where and why the error occurred.
service string always Instance service where the error occurred.
source object always Name and URL of the source instance.

instance_monitoring
cpu float always Current combined CPU usage of the instance.
created datetime always Date and time when the data was created.
source object always Name and URL of the source instance.

content_stats
size_byte integer always Physical size of the database in bytes.
created datetime always Date and time when the data was created.
source object always Name and URL of the source instance.

index_stats_global
total_document_count integer always Total number of unique document ids in the indexes.
total_document_references integer always Total number of document references in all indexes.
size_byte integer always Physical size of the index in bytes.
total_term_count integer always Total number of terms indexed.
total integer always Number of indexes in the instance.
created datetime always Date and time when the data was created.
source object always Name and URL of the source instance.

service_monitoring
service string always Service name.
cpu float always Current CPU usage of the service.
memory_byte integer always Memory used by the service in bytes.
age float always Seconds since the service was started.
status string always Status of the service (active or stopped).
created datetime always Date and time when the data was created.
source object always Name and URL of the source instance.

index_stats
terms integer always Number of index terms in the index.
docs integer always Number of unique content ids in the index.
refs integer always Number of document references in the index.
index string always Index name.
created datetime always Date and time when the data was created.
source object always Name and URL of the source instance.
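Because every field in the listings above is marked as always present, an imported item can be checked against the field names of its content type. The sketch below does this for two of the content types. The helper name missing_fields and the sample values are invented for illustration, and the "name" and "url" keys of the source object are an assumption based on its description.

```python
# Field names copied from the listings above (two content types shown).
REQUIRED_FIELDS = {
    "view": {"fuse_id", "created", "url", "source"},
    "index_stats": {"terms", "docs", "refs", "index", "created", "source"},
}


def missing_fields(content_type, record):
    """Return the sorted names of required fields absent from `record`."""
    return sorted(REQUIRED_FIELDS[content_type] - set(record))


# Illustrative record with made-up values; "fuse_id" is left out on purpose.
incomplete_view = {
    "created": "2024-01-01T12:00:00",
    "url": "http://node1.example.com/item/1",
    "source": {"name": "node1", "url": "http://node1.example.com"},
}
missing = missing_fields("view", incomplete_view)
```

Here missing contains only "fuse_id", the one required field the sample record lacks.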

The indexed data provides the following facets:

date
creation date of an entry
source_name
name of the data source
source_url
URL of the data source
type
holds the available content types
user
user ids
facet_count
how many facets were used in a search
facet
names of facets that were used in a search
query
the search expression
search_time
time needed to generate the results in milliseconds, rounded to two decimal places
created_hour
the hour of day when the item was created
view
content keys of the viewed items
service
name of the instance service
error
error names
status
status of the instance/services
cpu_usage
CPU usage of the instance/services (percent)
memory_usage
used memory of the instance/services (percent)
active_connections
number of connections to the API server
error_service
service names where the error occurred
index_name
index names
index_terms
number of index terms
index_refs
number of document references
index_docs
number of unique document ids in the index