Tech Talk Live Alfresco Performance Tuning – Part 2

Upload: luis-cabaceira

Post on 10-Jan-2017


Page 1: Alfresco tuning part2

Tech Talk Live: Alfresco Performance Tuning – Part 2

Page 2: Alfresco tuning part2

Speaker Bio
Luis Cabaceira – Principal Consultant at Alfresco

Page 3: Alfresco tuning part2

Agenda
1 – JVM tuning
2 – Garbage collection analysis
2 – Caches
3 – Alfresco is running slow: where to start?

Page 4: Alfresco tuning part2

1 – JVM Tuning
• Tune the memory and garbage collection parameters of the JVM to suit your situation. Enable GC logs and analyze them.
• Solr is more memory intensive than Alfresco.
• Alfresco consumes memory for the repository L2 cache and for Alfresco system memory.
• Tuning varies depending on whether Alfresco and Solr run on the same server and in the same JVM.
• Generally good settings for Alfresco (assuming a server with 16 GB of RAM):
    -Xms8000m -Xmx12000m -XX:MaxPermSize=512m -Xss1024K
    -XX:-DisableExplicitGC -XX:NewSize=2G -XX:+UseCodeCacheFlushing
    -Dsun.security.ssl.allowUnsafeRenegotiation=true -Djava.awt.headless=true
• Extra settings found on large Alfresco implementations (Solr does better with CMS):
    -XX:+UseConcMarkSweepGC -server
    -XX:+CMSIncrementalMode -XX:CMSInitiatingOccupancyFraction=80
    -XX:+UseParNewGC -XX:ParallelGCThreads=4
    -XX:+UseCompressedOops -XX:+CMSClassUnloadingEnabled
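Before applying a set of flags like the ones above, it can help to sanity-check the heap sizes against the RAM of the server. The following is a minimal Python sketch, not an Alfresco tool: the function names `parse_size` and `check_heap_flags` are hypothetical, and the two checks (initial heap not above maximum heap, maximum heap below physical RAM) are assumptions about what is worth verifying.

```python
import re

# Multipliers for the size suffixes accepted by -Xms / -Xmx.
_SIZE_UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_size(text):
    """Convert a JVM size such as '8000m' or '2G' to bytes."""
    match = re.fullmatch(r"(\d+)([kKmMgG]?)", text)
    if not match:
        raise ValueError(f"unrecognised JVM size: {text!r}")
    number, unit = match.groups()
    return int(number) * _SIZE_UNITS.get(unit.lower(), 1)


def check_heap_flags(java_opts, physical_ram_bytes):
    """Return a list of warnings about the -Xms/-Xmx flags in java_opts."""
    sizes = {}
    for flag in java_opts.split():
        match = re.fullmatch(r"-(Xms|Xmx)(\S+)", flag)
        if match:
            sizes[match.group(1)] = parse_size(match.group(2))

    warnings = []
    if "Xms" in sizes and "Xmx" in sizes and sizes["Xms"] > sizes["Xmx"]:
        warnings.append("-Xms is larger than -Xmx")
    if "Xmx" in sizes and sizes["Xmx"] >= physical_ram_bytes:
        warnings.append("-Xmx is not smaller than physical RAM")
    return warnings
```

Calling `check_heap_flags("-Xms8000m -Xmx12000m", 16 * 1024 ** 3)` returns an empty list for the 16 GB server assumed on the slide. A flag typed with a typographic dash instead of a hyphen is silently ignored by this sketch, so it is worth checking pasted options by eye as well.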

Page 5: Alfresco tuning part2

1 – Garbage Collector Tuning
Regular analysis of the garbage collection logs is a known best practice; the health of the garbage collection engine is normally related to the overall effectiveness of memory usage across the system. This is valid for Alfresco, Solr and any client that is part of the deployment.

The best practice is to choose an analysis timeframe that is known to be the period when the system is most heavily used, and to monitor the garbage collection operations that happened during that period.

There are several tools for analyzing garbage collection logs; the one I think generates the most accurate report is Censum from jClarity. A trial version can be downloaded and used to analyze GC logs for 7 days. You can also use GCViewer, a very useful open-source tool: https://github.com/chewiebug/GCViewer

Page 6: Alfresco tuning part2

1 – Garbage Collector common problems

1 - Look for periodic calls to System.gc(); if you find them, add the -XX:+DisableExplicitGC flag.

2 - Look for high pauses

High pauses from garbage collection can indicate a number of problems. A high percentage of time spent paused in GC may mean that the heap is undersized, causing frequent full GC activity. A high longest pause (in seconds) may indicate that the heap is too large, causing long individual garbage collections.

3 - Look for premature promotion of objects

Premature promotion occurs when objects that should be collected in a young generation pool (Eden or the Survivor "From" space) are instead promoted to Tenured (Old) space. This places additional pressure on Tenured space, which results in more frequent Tenured collections, and those interfere with your application's performance.

Look for premature
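The first two checks can be roughed out with a few lines of Python before reaching for Censum or GCViewer. This sketch assumes a HotSpot log written with -XX:+PrintGCDetails, where each event ends in "[Times: user=... sys=..., real=N secs]" and an explicit collection is logged as "Full GC (System.gc())" or "Full GC (System)". The function name `summarize_gc_log` is hypothetical. It treats every `real=` value as pause time, which overstates pauses when concurrent phases (for example with CMS) are logged, so the result is only a first indication.

```python
import re

# Wall-clock duration reported at the end of each logged GC event.
_REAL_TIME = re.compile(r"real=(\d+(?:\.\d+)?) secs")
# Collections triggered by an explicit System.gc() call.
_EXPLICIT_GC = re.compile(r"Full GC \(System(?:\.gc\(\))?\)")


def summarize_gc_log(lines, window_seconds):
    """Summarize pauses for a log excerpt covering window_seconds of uptime."""
    pauses = []
    explicit_calls = 0
    for line in lines:
        if _EXPLICIT_GC.search(line):
            explicit_calls += 1
        match = _REAL_TIME.search(line)
        if match:
            pauses.append(float(match.group(1)))

    percent_paused = 0.0
    if window_seconds > 0:
        percent_paused = 100.0 * sum(pauses) / window_seconds

    return {
        "collections": len(pauses),
        "explicit_gc_calls": explicit_calls,
        "longest_pause_seconds": max(pauses, default=0.0),
        "percent_time_paused": percent_paused,
    }
```

Feed it the lines from the peak-usage window only, and pass the length of that window in seconds, so that the percentage reflects the period when the system is most heavily used.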

Page 7: Alfresco tuning part2

2 – Caches (Ehcache/Hazelcast)

• Alfresco now uses Hazelcast for clustering and caching.
• The database is now used for cluster discovery.
• Removing a node from the cluster is now configured in alfresco-global.properties:
  • alfresco.cluster.enabled=false

• The repository caches are separated into 2 levels:
  • L1 = the transactional cache (TransactionalCache.java)
  • L2 = the Hazelcast distributed cache (>4.2.X)
  • The level 1 cache commits to the L2 cache.

• Tracing cache usage is very important for tuning.
  • Adding the following options to your JVM exposes the JMX features of Hazelcast:
  • -Dhazelcast.jmx=true -Dhazelcast.jmx.detailed=true

Page 8: Alfresco tuning part2

2 – Caches (Ehcache/Hazelcast)

• In Alfresco, Hazelcast works with factories that allow the creation of caches.
• You can define your own caches.

Page 9: Alfresco tuning part2

2 – Hazelcast cache mechanisms
With Hazelcast the cache is distributed across the cluster members, which gives a more linear distribution of memory usage. The Alfresco implementation offers several mechanisms for defining different cache cluster types.

Fully distributed
This is the normal value for a Hazelcast cache. Cache values (key-value pairs) are evenly distributed across cluster members. This leads to more remote lookups when a get request is issued and the value is held on another (remote) node.

Local cache
Some caches you may not want clustered (or distributed) at all; this option works as an unclustered cache.

Invalidating
This is a local, cluster-aware cache that sets up a messenger which sends invalidation messages to the remaining cluster nodes when you update an item in the cache, much like the old Ehcache mechanism.

Page 10: Alfresco tuning part2

2 – Tuning Hazelcast

To perform a cache tuning exercise we need to analyze 3 relevant factors:

- type of data
- how often it changes
- number of gets compared to the number of writes

If we can identify caches whose values do not change often, it is worth setting them to invalidating and checking the performance results.

Note that in distributed caches with a lot of remote gets, the remote get operation will be slow if the stored objects are big. This is mainly because the object is serialized and must be deserialized before its content is available, and that operation can take some time depending on the size of the object.
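The reasoning above can be turned into a rough first-pass filter over cache statistics (for example, numbers read from the Hazelcast JMX beans). The Python sketch below is only an illustration: the function name `suggest_cluster_type` and both thresholds (a get/put ratio of 20, and 64 KB as a "big" value) are arbitrary assumptions, not Alfresco or Hazelcast recommendations. Any suggestion it makes still has to be validated by measuring performance after the change.

```python
def suggest_cluster_type(gets, puts, avg_value_bytes,
                         min_read_write_ratio=20.0,
                         large_value_bytes=64 * 1024):
    """Suggest a cluster type for one cache; returns (cluster_type, ratio).

    gets / puts     : operation counts observed during a peak period
    avg_value_bytes : rough average size of a cached value
    Both thresholds are illustrative defaults, not documented values.
    """
    if puts == 0:
        ratio = float("inf") if gets > 0 else 0.0
    else:
        ratio = gets / puts

    # Big values make remote gets expensive (serialization cost), so accept
    # a lower read/write ratio before suggesting a local, invalidating cache.
    required_ratio = min_read_write_ratio
    if avg_value_bytes >= large_value_bytes:
        required_ratio = min_read_write_ratio / 2

    if ratio >= required_ratio:
        return "invalidating", ratio
    return "fully-distributed", ratio
```

A cache with 100,000 gets and 100 puts would be flagged as a candidate for invalidating, while one with roughly as many puts as gets would stay fully distributed.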

Page 11: Alfresco tuning part2

2 – Tuning Hazelcast
Cache values can be configured/overridden in alfresco-global.properties:

• cache.aclSharedCache.tx.maxItems=40000
• cache.aclSharedCache.maxItems=100000
• cache.aclSharedCache.timeToLiveSeconds=0
• cache.aclSharedCache.maxIdleSeconds=0
• cache.aclSharedCache.cluster.type=fully-distributed
• cache.aclSharedCache.backup-count=1
• cache.aclSharedCache.eviction-policy=LRU
• cache.aclSharedCache.eviction-percentage=25
• cache.aclSharedCache.merge-policy=hz.ADD_NEW_ENTRY

Look for: WARN [cache.node.nodesTransactionalCache] Transactional update cache ‘org.alfresco.cache.node.nodesTransactionalCache’ is full (125000).
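Scanning alfresco.log for that warning is easy to automate. The sketch below assumes the message has exactly the shape quoted above (with either straight or typographic quotes around the cache name); the function name `find_full_caches` is hypothetical. A cache that shows up repeatedly is presumably a candidate for reviewing its transactional size limit, which is an inference from the slide rather than something it states.

```python
import re
from collections import Counter

# Matches: Transactional update cache '<name>' is full (<size>)
_CACHE_FULL = re.compile(
    r"Transactional update cache\s+['‘’\"]?(?P<name>[\w.]+)['‘’\"]?"
    r"\s+is full \((?P<size>\d+)\)"
)


def find_full_caches(log_lines):
    """Return {cache_name: (warning_count, last_reported_size)}."""
    counts = Counter()
    sizes = {}
    for line in log_lines:
        match = _CACHE_FULL.search(line)
        if match:
            name = match.group("name")
            counts[name] += 1
            sizes[name] = int(match.group("size"))
    return {name: (counts[name], sizes[name]) for name in counts}
```

Passing an open log file works as well as a list of strings, since the function only iterates over lines.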

Page 12: Alfresco tuning part2

3 – Alfresco is running Slow (Where to Start)

• First we need to identify what is slow in Alfresco, and where:

  • Is it Alfresco that is slow?
  • Page rendering?
  • Dashboard takes a long time to render?
  • Login takes long?
  • Browsing the repository is very slow (permission evaluation?)
  • Uploading content performance (bulk import, migration, rules)
  • Search is slow
  • Workflow problems
  • CPU is at 100%
  • Memory is exhausted
  • Cluster communication problem?

Page 13: Alfresco tuning part2

3 – Alfresco is running Slow (Where to Start)

• Investigating: “Follow the Request”

• Is Apache or a physical load balancer being used in front of Alfresco?
  • Are there enough connections/threads/workers available for the existing load?
  • Any timeouts in the Apache/LB logs?
  • Check the overall performance of Apache.

• What are the Tomcat threads doing?
  • Use support tools, check real-time thread dumps, see behaviors/actions.
  • Run a series of jstack commands and check what the threads are doing.

• What is consuming the memory?
  • Extract heap dumps and jstacks and check what is occupying memory.
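When several jstack dumps have been collected, a small script makes them easier to compare. The Python sketch below assumes the usual HotSpot thread dump layout: a quoted thread name line, a "java.lang.Thread.State: ..." line, then stack frames starting with "at". The function name `summarize_thread_dump` is hypothetical. It counts threads per state and the most common top-of-stack frame, which gives a quick picture of where threads are waiting.

```python
import re
from collections import Counter

_STATE = re.compile(r"java\.lang\.Thread\.State: (\w+)")


def summarize_thread_dump(dump_text):
    """Return (states, top_frames) counters for one jstack dump."""
    states = Counter()
    top_frames = Counter()
    awaiting_first_frame = False

    for line in dump_text.splitlines():
        stripped = line.strip()
        if stripped.startswith('"'):
            # A quoted thread name starts a new thread section.
            awaiting_first_frame = True
            continue
        state = _STATE.search(stripped)
        if state:
            states[state.group(1)] += 1
        elif awaiting_first_frame and stripped.startswith("at "):
            # Only the first frame of each thread is recorded.
            top_frames[stripped[3:]] += 1
            awaiting_first_frame = False

    return states, top_frames
```

Running it over dumps taken a few seconds apart and comparing `top_frames.most_common(5)` shows whether the same frames keep appearing, which is usually more telling than a single dump.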

Page 14: Alfresco tuning part2

3 – Alfresco is running Slow (Where to Start)

• “Questioning the DB” – Key performance indicators

  • Response time
  • Blocked queries
  • Top queries by frequency and/or time
  • Slow queries
  • Average number of transactions per second (during a peak period)
  • Number of connections (during a peak period)
  • Database server health (CPU, memory, I/O, network)
  • Index size and health

• Inspect JDBC access to the database
  • Jdbcspy
  • Log4jdbc
  • JavaMelody

Page 15: Alfresco tuning part2

3 – Alfresco is running Slow (Where to Start)

• “Questioning the Storage” – Key performance indicators

• I/O performance (iometer, hdparm)
• Check both:
  • Alfresco content store storage
  • Solr index storage (should be faster)

• Run EVT; its last test checks the speed of the index disk storage and produces a meaningful report.

• Check the free disk space for the indexes (merging processes require at least 40% free).
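The free-space check is simple to script. This minimal Python sketch uses the standard library's `shutil.disk_usage`; the function names are hypothetical and the 40% default comes from the figure quoted above. The path to pass in is whatever directory holds the Solr indexes on your installation, which this sketch does not try to guess.

```python
import shutil


def free_space_percent(path):
    """Percentage of the filesystem holding `path` that is still free."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.free / usage.total


def has_merge_headroom(path, required_percent=40.0):
    """True if the filesystem has at least required_percent free."""
    return free_space_percent(path) >= required_percent
```

Scheduling this check and alerting when `has_merge_headroom` turns false gives a warning before index merges start to fail.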

Page 16: Alfresco tuning part2

3 – Alfresco is running Slow (Where to Start)

• “Questioning Alfresco” – Key performance indicators

• CPU usage, memory usage, threads
• Check alfresco.log for ERROR and WARN entries
  • Elasticsearch can be used to aggregate all relevant logs and search across them

• Enable the transformations log and check for transformation errors
• Verify transformation limits

• Enable GC logs on the Alfresco JVM and analyze GC performance
• Verify content policies, rules, scheduled tasks, integrations and customizations
• Analyze the use case and identify the log classes that can produce relevant information while in DEBUG mode. Use support tools for real-time troubleshooting.
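Without a log aggregation stack, a first pass over alfresco.log can be done with a short script. The Python sketch below assumes each entry has the level followed by the logger name in square brackets (as in "WARN [cache.node.nodesTransactionalCache] ..."); other log4j layouts would need a different pattern. The function name `count_problems` is hypothetical.

```python
import re
from collections import Counter

# Level followed by the bracketed logger name, e.g. "ERROR [some.Logger]".
_ENTRY = re.compile(r"\b(ERROR|WARN)\s+\[([^\]]+)\]")


def count_problems(log_lines):
    """Count ERROR and WARN entries per (level, logger) pair."""
    counts = Counter()
    for line in log_lines:
        match = _ENTRY.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts
```

`count_problems(lines).most_common(10)` lists the noisiest loggers, which is a reasonable starting point for deciding which log classes to switch to DEBUG.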

Page 17: Alfresco tuning part2

3 – Alfresco is running Slow (Where to Start)

• “Questioning Solr” – Key performance indicators

• CPU usage, memory usage, threads
• Check solr.log for ERROR and WARN entries
  • Elasticsearch can be used to aggregate all relevant logs and search across them

• Enable query logs and check for errors
• Verify Solr statistics and cache usage

• Enable GC logs on the Solr JVM and analyze GC performance
• Verify merging problems, slow disks, insufficient free space and configuration problems
• Analyze the search use case, blacklist some MIME types and keep your index small: only index what you will search for.