Our partners are developing custom tools (typically console applications) to import
external data into ADAM all the time. They do a great job, but performance is sometimes
an issue. This blog post lists some important attention points and ways to improve
this performance.
Disable the full-text index
Prevent the full-text index from updating during the import. This would take up
a fair amount of the system's resources, while the index can still easily be
rebuilt afterwards.
To disable the update, run Adam.Core.DatabaseManager.exe DisableFullTextIndex
before the import and run Adam.Core.DatabaseManager.exe EnableFullTextIndex
to re-enable the full-text index after the import.
Note that this can't be done when importing on a live production server, for
obvious reasons.
e.g.:
C:\Program Files\Adam Software\Adam\Adam.Core.DatabaseManager.exe DisableFullTextIndex
-database=Test
Update SQL Server statistics
In case you're importing a large number of records (e.g. more than doubling the
size of your database), the statistics in SQL Server can rapidly get outdated. Therefore,
it's best to run the sp_updatestats stored procedure during the import,
after every few thousand records.
This ensures that the update doesn't gradually slow down during a large import.
This is much less relevant when you're only increasing the number of records by
5% for example.
e.g.:

Keep fragmentation low
Try to keep the fragmentation of your indexes as low as possible. Check for fragmentation
before the import and also a couple of times during the import. By running the
sys.dm_db_index_physical_stats
management view in SQL Server Management Studio you can see the fragmentation level
of all the tables and indexes in your database.
Pay special attention to the
avg_fragmentation_in_percent column: this value
should be as low as possible, and 0 is best.
e.g.: the following query for example returns the fragmentation info for all objects
in YourDatabase:
select * from sys.dm_db_index_physical_stats(DB_ID('YourDatabase'), null, null,
null, null)
When you notice fragmentation, the easiest way to reorganize or rebuild all indexes
is by creating a Maintenance Plan in SQL Server. You can find more info on how to
set this up in the ADAM Admin Guide > Post-installation tasks. In case there
are only one or two indexes to defragment, you can simply right-click the Indexes
node in the Object Explorer and choose Rebuild or Reorganize depending on the level
of fragmentation:

The document Reorganizing
and Rebuilding Indexes explains when to perform a rebuild and when to perform
a reorganization.
Disable Allow Operator Searches
Having many fields with the Allow Operator Searches property enabled can
also cause an import to slow down: each time a record is updated all these indexes
need to be updated as well. It could be faster to disable these indexes before the
import (assuming you don't need to query for these fields during the import)
and re-enabling them afterwards.
This is also not recommended when importing on a live production server.
Run multiple imports concurrently
One of the most expensive operations of an import is preview creation. A preview
of a single record is often (depending on the media engine used) done using a single
thread, so other cores of that server wouldn't have anything to do at that time.
If you write your import tool so that it can import a part of the total import,
you could start several runs of the importer simultaneously and have each perform
their part. And in case you use the ADAM maintenance framework for your importer
this is automatically done for you. All you need to do is start multiple threads
of the maintenance manager.
e.g.: if most of your import time is spent creating previews and you're using
a quad core CPU, then you will get the most out of your machine by starting 4 threads.
The following command starts 4 threads of the RunMaintenanceJobs command:
C:\Program Files\Adam Software\Adam>Adam.Core.CommandLine.exe RunMaintenanceJobs
-threads=4
It's not recommended to start more threads than you have cores, as this may
have a negative impact on performance. You could however use this command on multiple
ADAM servers simultaneously (if your license allows it).
Use different accounts when running concurrently
If you're running multiple imports concurrently, be aware that the StorageUsed
property of the user account executing the import is incremented for each file.
To help preventing locking it's best to run each importer with a separate user
account. This also ensures that you'll never have transactions waiting for the
commit of another transaction which is also trying to increment this StorageUsed
value.
Simplify your SearchExpressions
Custom importers usually contain some queries to check whether specific records
or classifications already exist or not. Typically this is done by searching for
an object and then either trying to load it from the database or by using one of
the ADAM helper classes. It's important to know that all SearchExpressions are
always translated by ADAM into a single SQL query. So, the simpler the search expression,
the simpler the resulting SQL query and the faster it will execute.
By consequence you should try to write your search expressions using ID's instead
of names, identifiers or especially labels, not only for records or classifications,
but also when searching for any object that you can identify using an ID.
e.g.:
the search expression
Classification = '650FBB19-13CE-4794-A7EF-CDAC4EF7DB7B'
could return the exact same result as the following one
Classification.NamePath = '/Products/Dishwashers/Type 123'
but the first will execute much faster than the second.
Use ReadUncommitted
As mentioned before, running multiple imports concurrently is a good way for gaining
performance. But it has a price, because the more imports you run concurrently,
the more chance you have that they'll lock each other during the import as they'll
probably be querying and updating the same information. To prevent this from happening,
run ReadUncommitted queries where possible when querying the database.
e.g.:
the following code tries to get the ID of the record that has a SKU-field with the
value 1234. Since it must perform a database scan, there's a chance that the SQL
query generated by this code will get blocked by other imports that are currently
creating records (because they could be creating a record with SKU value 1234 as
well).
| C# |
1
2
|
RecordHelper helper = new RecordHelper(app);
Guid? id = helper.GetId(new SearchExpression("sku = 1234"));
|
The next code does exactly the same thing, but with two very important differences:
- First, the second argument tells the GetIds-method not to perform a security
check. Since there's a very good chance that you're running your import
with an account that has access to pretty much everything in the database, you might
as well tell ADAM not to perform a security check.
- Second, the ReadUncommitted argument prevents this query from being blocked
by other updates that other importers may be doing.
| C# |
1
2
3
|
RecordHelper helper = new RecordHelper(app);
Guid? id = helper.GetId(new SearchExpression("sku = 1234"),
false, LockMode.ReadUncommitted);
|
Network
When running multiple imports concurrently, make sure your network or IO doesn't
become a bottleneck. Set the Catalog Log Details setting to Verbose, import
a bunch of files using your importer and then check the log to see how long each
media engine actually took to perform a specific action on a file. If you notice
that these operations start to slow down as you run more imports concurrently, you
may be facing an IO or network issue. What could help to get around this, is to
copy each file locally before importing it so that the network is hit only once
for each file and that the IO is more distributed over the different import machines.
These are a few tips to improve your bulk import performance tremendously. Feel
free to post any other suggestions in the comments!