aem - key learning from escalations
TRANSCRIPT
© 2017 Adobe Systems Incorporated. All Rights Reserved. Adobe Confidential.
AEM | Key Learnings from Escalations
Kanika Gera
What will you Learn?
• Mixed Bag
• Mongo Facts
• Mongo Issues & Solutions
• AEM Target Facts
Mixed Bag
Nitty Gritty Details
• If the java.io.tmpdir directory keeps growing and fills up the disk, asset processing/upload will either fail or skip process steps and run forever.
• Log messages like this mean the connectivity between S3 and AEM is not stable:
28.03.2017 13:41:58.059 *INFO* [oak-lucene-22] com.amazonaws.http.AmazonHttpClient Unable to execute HTTP request: acc-dsg-prod-aemcluster.s3.amazonaws.com failed to respond
org.apache.http.NoHttpResponseException: acc-dsg-prod-aemcluster.s3.amazonaws.com failed to respond
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143)
at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57)
at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260)
• A Sling topology change event can cause job queues to get stuck, with threads like the one below:
at org.apache.sling.event.impl.jobs.tasks.CheckTopologyTask.assignJobs(CheckTopologyTask.java:261)
at org.apache.sling.event.impl.jobs.tasks.CheckTopologyTask.assignUnassignedJobs(CheckTopologyTask.java:222)
Avoid changes that trigger a topology change, such as modifying sync agents, adding or removing an AEM node, or restarting AEM.
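The first point above (java.io.tmpdir filling the disk) can be watched for with a small check. The following is a minimal sketch, not an AEM tool: the function name `tmpdir_report` and the 10% free-space threshold are assumptions made for illustration, and the path passed in is assumed to be the same directory the JVM uses for java.io.tmpdir.

```python
import os
import shutil


def tmpdir_report(path, warn_free_ratio=0.10):
    """Report how much space files under `path` use and whether the disk is low.

    `warn_free_ratio` is a hypothetical threshold: the report flags low space
    when the free fraction of the disk holding `path` drops below it.
    """
    used_by_dir = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                used_by_dir += os.path.getsize(full)
            except OSError:
                # Temp files are short-lived; one may vanish before we stat it.
                continue
    usage = shutil.disk_usage(path)
    free_ratio = usage.free / usage.total
    return {
        "dir_bytes": used_by_dir,
        "free_ratio": free_ratio,
        "low_space": free_ratio < warn_free_ratio,
    }
```

Calling `tmpdir_report("/path/to/tmpdir")` on a schedule and alerting when `low_space` is true would surface the problem before asset processing starts failing.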
Nitty Gritty Details
• blockRepositoryWrites() is unsupported and throws javax.jcr.UnsupportedRepositoryOperationException:
at com.day.crx.sling.server.impl.jmx.ManagedRepository.blockRepositoryWrites()
Don't do code deployments during a backup. If you have a datastore, make sure to back it up after the crx-quickstart folder. When you restore from a backup, exclude the repository/index folder: it is a cache and, unlike the tar files, is not hot-backup safe.
• Adding an AEM node to an authoring cluster triggers a repository scan on Oak core 1.2.x/1.4.x with a 3-instance Mongo replica set: NPR-16423
• How the observation queue works: observation lets listeners register for changes happening under a set of paths and distinguish whether those changes were made locally or externally. Each listener registered with Oak gets its own observation queue. This queue holds a reference to each commit (the revision of the commit in the case of MongoDB), and when the event is delivered to the listener, Oak provides the diff for that commit. Event references reach these observation queues from two sources: local commits and changes made on other cluster nodes.
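The observation queue description above can be pictured with a toy model. This is a sketch for intuition only and is not Oak's implementation: the names `CommitRef` and `ListenerQueue`, the `max_queued` limit, and the choice to reject new references when the queue is full are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class CommitRef:
    revision: int    # stand-in for a commit revision
    external: bool   # True if the commit was made on another cluster node


class ListenerQueue:
    """One queue per registered listener; it stores commit references only."""

    def __init__(self, paths, max_queued=1000):
        self.paths = tuple(paths)   # paths the listener registered for
        self.max_queued = max_queued
        self.pending = deque()
        self.delivered = []

    def offer(self, ref):
        # Toy behaviour (assumption): refuse new references once the queue is full.
        if len(self.pending) >= self.max_queued:
            return False
        self.pending.append(ref)
        return True

    def deliver(self, diff_for):
        # The diff is computed at delivery time, not when the commit is queued.
        while self.pending:
            ref = self.pending.popleft()
            changed = [p for p in diff_for(ref) if p.startswith(self.paths)]
            if changed:
                self.delivered.append((ref.revision, ref.external, changed))
```

Because every listener has its own queue, a single slow listener accumulates a backlog of commit references without blocking the others, which is why queue length is worth monitoring on a busy cluster.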
Mongo Treasures
Mongo Treasures:
• Configure Mongo Logs https://docs.mongodb.com/v3.2/reference/log-messages/#verbosity-levels
• Ideal Mongo Parameters:
mongouri="mongodb://aemuser:[email protected]:27017,accdsg-lnx-mdb20.rrd.com:27017/aem-author?authsource\=aem-author&authMechanism\=MONGODB-CR&replicaSet\=aem;readPreference\=nearest;w\=2"
-Doak.queryLimitInMemory=300000 -Doak.queryLimitReads=300000 -Doak.fastQuerySize=true -Dupdate.limit=500000
-Doak.indexUpdate.failOnMissingIndexProvider=true -Djackrabbit.maxQueuedEvents=1000000 -Doak.documentMK.maxServerTimeDiffMillis=31000
-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=${JAVA_IO_TMP_DIR}
-XX:NewRatio=1 -XX:MaxMetaspaceSize=2048m -XX:+UseG1GC
-Xloggc:/mnt/crx/author/crx-quickstart/logs/gc.log -XX:+PrintGCDetails -verbosegc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps
-XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -XX:+PrintReferenceGC -XX:+PrintAdaptiveSizePolicy
-XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=50M
• AEM Oak 1.4.x MongoDB cluster is NOT supported across geographical regions. All Mongo and AEM instances should be set up in the same datacenter.
Mongo Treasures
Disabling the persistent cache for some nodes:
persistentCache="/path/to/crx-quickstart/repository/cache,size\=4096,binary\=0,-nodes,-children"
When indexing updates are slow, set the system property -Doak.disableJournalDiff=true. Remove it again once indexing has caught up to the current state.
Disable Mongo GC: change the value from crx to crx3-disabled in /libs/granite/operations/config/maintenance/granite:weekly/granite:MongoDataStoreGarbageCollectionTask
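The persistentCache value shown on this slide is a comma-separated option string, and it can help to see its parts pulled apart. The sketch below is a hypothetical helper, not part of Oak: it assumes the first element is the cache directory, `key=value` elements are settings, and elements starting with `-` name caches that are switched off, which is how the slide title reads them.

```python
def parse_persistent_cache(value):
    """Split a persistentCache option string into (path, settings, disabled)."""
    # The slide's .config syntax escapes '=' as '\='; undo that first.
    parts = value.replace("\\=", "=").split(",")
    path, settings, disabled = parts[0], {}, []
    for part in parts[1:]:
        part = part.strip()
        if part.startswith("-"):
            # Assumption: a leading '-' means "do not cache this type".
            disabled.append(part[1:])
        elif "=" in part:
            key, val = part.split("=", 1)
            settings[key] = val
    return path, settings, disabled
```

Applied to the value on the slide, this separates the cache directory, the `size` and `binary` settings, and the `nodes` and `children` entries that are turned off.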
Mongo Issues & Solutions
Mongo Issues & Solutions
"Caused by: com.mongodb.MongoExecutionTimeoutException: operation exceeded time limit" happens because of long-running Mongo queries. Tune -Doak.mongo.maxQueryTimeMS=60000 to stop queries from running longer than 1 minute.
Sling jobs (e.g. replications) are slow when performed in bulk on an AEM 6.2 system using MongoDB with the persistent cache enabled. This can occur during any write-heavy activity on the system.
This is caused by limitations of the Oak H2 MVStore implementation used by the persistent cache. On systems with a large Java heap, the persistent cache degrades performance.
KCS: https://adobe--c.na20.visual.force.com/kA414000000KLwa?lang=en_US
Mongo Issues & Solutions
Warnings like:
[DocumentDiscoveryLiteService-BackgroundWorker-[1]] org.apache.jackrabbit.oak.plugins.document.DocumentDiscoveryLiteService hasBacklog: no lastKnownRevision found
The backlog nodes are ones that are no longer active and have finished recover(), but for which a backgroundRead is still pending to read the latest root changes. These warnings can be ignored.
MongoTimeoutException errors like:
org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService(157)] The activate method has thrown an exception (com.mongodb.MongoTimeoutException: Timed out after 10000 ms while waiting for a server that matches
Set connectTimeoutMS to 120000 (2 minutes) in the MongoURI to increase the timeout. The MongoURI is set either in the start script JVM parameters or in the org.apache.jackrabbit.oak.plugins.document.DocumentNodeStoreService.config file under crx-quickstart/install.
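Adding the timeout to an existing MongoURI by hand is error-prone when the URI already carries several options. The helper below is a minimal sketch under stated assumptions: `with_connect_timeout` is a hypothetical name, the input is a plain URI that already contains a database path (no `\=` escaping as used in .config files), and options separated by `;` are rewritten with `&`.

```python
import re


def with_connect_timeout(uri, timeout_ms=120000):
    """Return `uri` with connectTimeoutMS set, replacing any existing value."""
    # Assumption: the URI already has a "/database" part before any "?".
    base, _, query = uri.partition("?")
    # Accept both '&' and ';' as option separators on input.
    options = [o for o in re.split(r"[&;]", query) if o]
    # Drop an existing connectTimeoutMS so the new value is the only one.
    options = [o for o in options if not o.lower().startswith("connecttimeoutms=")]
    options.append(f"connectTimeoutMS={timeout_ms}")
    return base + "?" + "&".join(options)
```

For example, `with_connect_timeout("mongodb://host1:27017/aem-author?replicaSet=aem")` appends `connectTimeoutMS=120000` to the existing option list; the host and database names here are placeholders.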
AEM – Target Facts
AEM-Target Facts
From 6.1, the Target parsys is broken: once you target a container, it is impossible to move, delete, or add any nested components inside it. This worked up to AEM 6.0. The regression is tracked in https://jira.corp.adobe.com/browse/CQ-82200, with a fix planned for AEM 6.4.
AT.js is the new library for client-side integration with Adobe Target, replacing mbox.js. AEM supports using this library along with mbox.js with FP-11577. Server side integration isn’t impacted by the change.
The XML-based API is deprecated as of AEM 6.2; the JSON-based API is used now.
A/B testing is not supported in the Touch UI.
The OR condition is not synchronized from AEM to Target, and the UI does not support it.
AEM-Target Issues & Solutions
When picking the targeting engine, which one should you use and why: AEM or Target?
The AEM engine doesn't have metrics and is not connected to Marketing Cloud ID, which means no shared audiences, no A4T, no customer attributes, etc. It will let you use ContextHub/Client Context.
Target gets metrics and segments, which helps in personalization.
Q&A
THANK YOU!