fisl: content management primer

A Content Management Primer: What I Wish I Knew Richard Esplin Community Technology

Upload: resplin

Post on 08-May-2015


DESCRIPTION

An introduction to building content applications using the CMIS API. Presented at FISL13.

TRANSCRIPT

Page 1: FISL: Content Management Primer

A Content Management Primer: What I Wish I Knew

Richard Esplin

Community Technology

Page 2: FISL: Content Management Primer

Patterns for Handling Content in Applications


Page 3: FISL: Content Management Primer

Why Relational Won't Cut It


Page 4: FISL: Content Management Primer

Solving SharePoint Type Problems With An Open Source Stack


Page 5: FISL: Content Management Primer

Agenda

● Making the case for content management
● Best practices: the platform approach
● Introducing CMIS
● Live examples

Page 6: FISL: Content Management Primer

What is Alfresco?

Enterprise content management platform across cloud, on-premise, or both

API for content applications that can run in the cloud, on-premise, or both

Content hub for your enterprise

[Diagram labels: tablets, cloud, on-premise, hybrid cloud sync]

Page 7: FISL: Content Management Primer

What is “content”?

● Data
  ● Don't mistake Code for Content
● Unstructured Data
  ● Structured data works well in a relational data store, XML store, or key-value store
● Unstructured Binary Data
  ● Unstructured non-binary data works well in source control
● Examples:
  ● Audio, Video, Images, Office Documents, Engineering Files, Reports

Page 8: FISL: Content Management Primer

What is a “content-centric application”?

● Applications that access binary files
● Files are often generated collaboratively
● Often must deal with large numbers of files
● May include a mix of structured and unstructured content
● May also include business processes

Page 9: FISL: Content Management Primer

A few examples

● Web site with catalogs, white papers, and videos
● Expense report review and approval
● Contract negotiation, creation, and review
● Research study authoring
● Sales / Marketing collateral creation and communication
● Course guide authoring and publishing
● Images and media in games
● Media curation, transformation, and delivery
● Legal compliance and corporate records management

Page 10: FISL: Content Management Primer

Or the business is saying . . .

● I've got a ton of files,
● I've got people that produce and consume them,
● I've got systems that use them,
● I want to make it easier!

Doug Waldron (cc attribution share-alike) http://www.flickr.com/photos/dougww/922328173/

Page 11: FISL: Content Management Primer

Let's build it ourselves!

Pasukaru76 (cc attribution) http://www.flickr.com/photos/pasukaru76/4277763808/

Page 12: FISL: Content Management Primer

DIY approach seems simple . . .

● “This is simple stuff.”
● Grab a web-application toolkit
● Favorite front-end / presentation framework
● Store a bunch of files
● Relational Database
  ● Data Model / Metadata
  ● Comments / Ratings
  ● Tagging / Categorization

Page 13: FISL: Content Management Primer

File storage options

● On disk
● Amazon S3 or an internal CAS filer
● Source code control repository
● XML database
● NoSQL document store

Page 14: FISL: Content Management Primer

Relational may not cut it

● Good at text and numbers. Not so good at binary.
● Good at static table definitions. Not so good at dynamic aspects.
● Size limits.
● Random seek (streaming).
● Search: Some relational databases can index into blobs, but not all.

Page 15: FISL: Content Management Primer

Once files are figured out . . .

● Ensure security
● Execute a workflow
● Transform the content between types
● Schedule a job
● Provide shared drive access
● Versioning
● Replication
● API Access
● Integrate with authoring tools

Lots of custom code!

Page 16: FISL: Content Management Primer

The optimistic scenario

gobucks2 (cc attribution non-commercial share-alike) http://www.flickr.com/photos/69331170@N00/2854583096

Page 17: FISL: Content Management Primer

The pessimistic scenario

http://commons.wikimedia.org/wiki/File:Professor_Lucifer_Butts.gif

Page 18: FISL: Content Management Primer

Evaluating DIY reasonableness

● Number and size of documents
● Number and concurrency of users
● Number and nature of integration points
● Business process volatility and complexity
● Time and cost of
  ● Integrating all of these services / sub-systems
  ● Maintaining all of that code . . . forever
● Access to off-the-shelf alternatives

Page 19: FISL: Content Management Primer

Introducing the content repository

● Content = a file + metadata
● File system
  ● Content binaries
  ● Search indexes
● Database
  ● Relations (associations)
  ● Metadata
● Repository
  ● Abstraction layer
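
The split described on this slide (binaries on the file system, metadata in a database, a repository layer in front of both) can be sketched in a few lines of Python. This is only a toy under stated assumptions, not how Alfresco or any other product is implemented: the `TinyRepository` class, its table layout, and its method names are all hypothetical.

```python
import json
import sqlite3
import tempfile
import uuid
from pathlib import Path


class TinyRepository:
    """Hypothetical abstraction layer: binaries on disk, metadata in SQLite."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)
        # An in-memory database keeps the sketch self-contained.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE node (id TEXT PRIMARY KEY, name TEXT, metadata TEXT)"
        )

    def store(self, name, data, **metadata):
        # The binary goes to the file system under an opaque id ...
        node_id = uuid.uuid4().hex
        (self.root / node_id).write_bytes(data)
        # ... while the name and metadata go to the database.
        self.db.execute(
            "INSERT INTO node VALUES (?, ?, ?)",
            (node_id, name, json.dumps(metadata)),
        )
        return node_id

    def fetch(self, node_id):
        row = self.db.execute(
            "SELECT name, metadata FROM node WHERE id = ?", (node_id,)
        ).fetchone()
        if row is None:
            raise KeyError(node_id)
        name, metadata = row
        return name, json.loads(metadata), (self.root / node_id).read_bytes()


repo = TinyRepository(tempfile.mkdtemp())
node_id = repo.store("report.pdf", b"%PDF-1.4 ...", author="alice", tags=["demo"])
name, metadata, data = repo.fetch(node_id)
print(name, metadata, len(data))
```

Callers only ever see `store` and `fetch`; where the bytes and the metadata actually live is the repository's business, which is the point of the abstraction layer.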

Page 20: FISL: Content Management Primer

Components of content-centric systems

● User Interface
● Persistence / Data Model / Metadata
● Business Process / Workflow
● Library Services (Upload / Download, Versioning, Check-in / Check-out)
● Security
● Search
● Scheduler
● Transformation / Rendition / Thumbnails
● Tagging / Categorization
● Authoring tool integration
● Remote API
● Transfer / Publication
● Comments
● Ratings
● Activity Streams / Notification
● Quotas

Page 21: FISL: Content Management Primer

Packaged systems

Page 22: FISL: Content Management Primer

Open source content management

● Alfresco
● Nuxeo
● KnowledgeTree
● Magnolia
● Apache Jackrabbit
● Plone
  ● (cmis4plone)

Page 23: FISL: Content Management Primer

Best Practice: The Platform Approach

Page 24: FISL: Content Management Primer

Platform approach

● The common problems have been solved
  ● Content Platform = Repository + Services
● Find a platform that meets your needs
● Extend the platform with your own business logic
● Customize the UI that the platform provides
● Or write your own front-end using whatever language or framework makes sense
● Meets your current needs while providing a roadmap for the future

Page 25: FISL: Content Management Primer

Evaluating content platforms

● Agility
  ● Applicable to a broad set of solutions vs a vertical specific solution
  ● Scale up, scale down
● Developer ergonomics
  ● Fast and friendly developer model
● Open Source
  ● Troubleshooting
  ● Bug tracking
  ● Community
● Standards compliance
  ● Easier integration
  ● Lower migration costs
  ● Developer familiarity

Page 26: FISL: Content Management Primer

General architecture

[Architecture diagram. Labels: Web Applications, Knowledge Portals, Web Services, Virtual File System, High Availability, Business Process Engine, CRM, Portal Server, App Server]

Page 27: FISL: Content Management Primer

[Integration diagram. Clients and channels: Desktop, Mobile, Social Media Channels, Web Services, Public Alfresco Cloud, Corporate Systems. Protocols and interfaces: Open Web APIs, CMIS, JSR-168, Connectors, WebDAV, CIFS, SharePoint Protocol, CMIS-based Alfresco Sync]

Page 28: FISL: Content Management Primer


Page 29: FISL: Content Management Primer

What is CMIS?

● Content Management Interoperability Services
● Language-independent, vendor-neutral API for content management
● Least-common-denominator (some vendors have extensions)
  ● CRUD functions for nodes
  ● Check-in / check-out
  ● Associations
  ● Permissions (Access Control Lists)
  ● Policies
  ● Queries
  ● Repository Traversal
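
To make the check-in / check-out item in the list above concrete, here is a minimal in-memory sketch in Python. The `VersionedDocument` class and its method names are hypothetical and only mirror the general shape of the idea (one working copy at a time, a new version on check-in); the CMIS specification defines the real semantics.

```python
class VersionedDocument:
    """Toy version series with check-out / check-in semantics."""

    def __init__(self, name, content):
        self.name = name
        self.versions = [content]   # version history, oldest first
        self.working_copy = None    # set while the document is checked out

    def is_checked_out(self):
        return self.working_copy is not None

    def check_out(self):
        # Only one working copy may exist at a time.
        if self.is_checked_out():
            raise RuntimeError("%s is already checked out" % self.name)
        self.working_copy = self.versions[-1]
        return self.working_copy

    def check_in(self, new_content):
        # Checking in appends a new version and releases the working copy.
        if not self.is_checked_out():
            raise RuntimeError("%s is not checked out" % self.name)
        self.versions.append(new_content)
        self.working_copy = None
        return len(self.versions)

    def cancel_check_out(self):
        # Discard the working copy without creating a version.
        self.working_copy = None


doc = VersionedDocument("contract.txt", "draft 1")
doc.check_out()
version_count = doc.check_in("draft 2")
print(version_count, doc.versions[-1], doc.is_checked_out())
```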

Page 30: FISL: Content Management Primer

What is CMIS?

● OASIS standard
  ● 30+ ECM vendors agreed to implement
● Two parts
  ● Interoperability through standard SOAP and AtomPub bindings
    – JSON bindings coming soon
  ● SQL-based query language for rich content repositories
● Vendor specific extensions may be useful
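
Because the query language is SQL-based, user-supplied text has to be escaped before it goes into a query string. The Python sketch below assumes the backslash escaping rules of the CMIS 1.0 query grammar (backslash before a quote or backslash, and before `%` and `_` inside a LIKE pattern); the helper names are hypothetical, and the result should be checked against the repository in use.

```python
def escape_cmis_string(value, like=False):
    """Escape text for use inside a single-quoted CMIS query literal."""
    # Escape backslashes first so the escapes added below are not doubled.
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    if like:
        # Inside a LIKE pattern, % and _ are wildcards unless escaped.
        escaped = escaped.replace("%", "\\%").replace("_", "\\_")
    return escaped


def folders_with_name_containing(fragment):
    """Build a query for folders whose cmis:name contains the fragment."""
    pattern = "%" + escape_cmis_string(fragment, like=True) + "%"
    return "select * from cmis:folder where cmis:name like '" + pattern + "'"


query = folders_with_name_containing("alf")
tricky = folders_with_name_containing("50%_o'clock")
print(query)
print(tricky)
```

The resulting string can be passed to any client's query call; only the literal is escaped, so the wildcards added around it keep their meaning.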

Page 31: FISL: Content Management Primer

Use cases

● Collaborative content creation
● Portals
● Client application integration
● Mashups
● Embedded content store
● Workflow & BPM
● Archival
● Document generation
● Digital Asset Management (DAM)
● Web Content Management (WCM)

[Diagram labels: Client, Content Repository]

Page 32: FISL: Content Management Primer

The beauty of CMIS

[Diagram labels: Presentation Tier, Content Services Tier, Enterprise Apps Tier, REST, SOAP]

Page 33: FISL: Content Management Primer

Meet CMIS

[Diagram labels: Client (Consumer), Content Repository (Provider), Services, Domain Model, read, write, Vendor Mapping, Content Management Interoperability Services]

CMIS lets you read, search, write, update, delete, version, control, … content and metadata!

Page 34: FISL: Content Management Primer

Types

Document
  ● Content
  ● Renditions
  ● Version History
Folder
  ● Container
  ● Hierarchy
  ● Filing
Relationship
  ● Source Object
  ● Target Object
ACL
  ● Target Object
Policy
  ● Target Object

Described by Type Definitions

Page 35: FISL: Content Management Primer

Type Definitions

Object
  ● Type Id
  ● Parent
  ● Display Name
  ● Queryable
  ● Controllable
Document
  ● Versionable
  ● Allow Content
Folder
Relationship
  ● Source Types
  ● Target Types
Policy
Custom Type
Property
  ● Property Id
  ● Display Name
  ● Type
  ● Required
  ● Default Value
  ● …
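
One way to picture how type definitions and property definitions fit together is the Python sketch below. The field names follow the attributes listed on this slide, but the classes, the `demo:invoice` type, and the inheritance and validation behaviour are illustrative assumptions, not the CMIS object model itself.

```python
from dataclasses import dataclass, field


@dataclass
class PropertyDefinition:
    property_id: str
    display_name: str
    property_type: str            # e.g. "string", "decimal", "datetime"
    required: bool = False
    default_value: object = None


@dataclass
class TypeDefinition:
    type_id: str
    parent: "TypeDefinition" = None
    display_name: str = ""
    queryable: bool = True
    controllable: bool = False
    properties: dict = field(default_factory=dict)

    def add_property(self, prop):
        self.properties[prop.property_id] = prop

    def all_properties(self):
        # Assumption: a custom type inherits its parent's property definitions.
        inherited = self.parent.all_properties() if self.parent else {}
        return {**inherited, **self.properties}

    def missing_required(self, values):
        # Required properties with no supplied value and no default.
        return sorted(
            pid for pid, prop in self.all_properties().items()
            if prop.required and prop.default_value is None and pid not in values
        )


document = TypeDefinition("cmis:document", display_name="Document")
document.add_property(PropertyDefinition("cmis:name", "Name", "string", required=True))

# Hypothetical custom type extending the base document type.
invoice = TypeDefinition("demo:invoice", parent=document, display_name="Invoice")
invoice.add_property(
    PropertyDefinition("demo:amount", "Amount", "decimal", required=True))
invoice.add_property(
    PropertyDefinition("demo:currency", "Currency", "string",
                       required=True, default_value="BRL"))

missing = invoice.missing_required({"cmis:name": "inv-001.pdf"})
print(missing)
```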

Page 36: FISL: Content Management Primer

Apache Chemistry

● Open Source implementations of CMIS
● Umbrella project for all CMIS related projects within the ASF
  ● OpenCMIS (Java, client and server)
  ● cmislib (Python, client)
  ● phpclient (PHP, client)
  ● DotCMIS (.NET, client)
● De-facto reference for CMIS and used by CMIS technical committee to test 1.1 features

Page 37: FISL: Content Management Primer

Examples

Page 38: FISL: Content Management Primer

My setup

● Debian Mint Wheezy
● OpenJDK 1.6.0_24
● Python 2.7.2
● Alfresco Community Edition 4.0.d
● Open CMIS Workbench 0.7.0

Page 39: FISL: Content Management Primer

CMIS Workbench

● Download
  ● http://chemistry.apache.org/java/developing/tools/dev-tools-workbench.html
● Connect to Alfresco
  ● http://localhost:8080/alfresco/cmisatom
● Good tool for figuring out what CMIS can do
● Check out the Groovy Console!
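
The AtomPub URL above is the kind of endpoint that answers with an XML service document describing the available repositories. As a rough Python sketch of what a client does with that response, the snippet below parses a hand-written, heavily trimmed sample. The sample XML and the `list_repositories` helper are hypothetical; the namespace URIs are the ones used by CMIS 1.0, and a real response carries far more detail.

```python
import xml.etree.ElementTree as ET

NS = {
    "app": "http://www.w3.org/2007/app",
    "atom": "http://www.w3.org/2005/Atom",
    "cmis": "http://docs.oasis-open.org/ns/cmis/core/200908/",
    "cmisra": "http://docs.oasis-open.org/ns/cmis/restatom/200908/",
}

# Hypothetical, trimmed service document for illustration only.
SAMPLE = """
<app:service xmlns:app="http://www.w3.org/2007/app"
             xmlns:atom="http://www.w3.org/2005/Atom"
             xmlns:cmis="http://docs.oasis-open.org/ns/cmis/core/200908/"
             xmlns:cmisra="http://docs.oasis-open.org/ns/cmis/restatom/200908/">
  <app:workspace>
    <atom:title>Main Repository</atom:title>
    <cmisra:repositoryInfo>
      <cmis:repositoryId>demo-repo-id</cmis:repositoryId>
      <cmis:repositoryName>Main Repository</cmis:repositoryName>
      <cmis:cmisVersionSupported>1.0</cmis:cmisVersionSupported>
    </cmisra:repositoryInfo>
  </app:workspace>
</app:service>
"""


def list_repositories(service_xml):
    """Return id, name, and CMIS version for each workspace in the document."""
    root = ET.fromstring(service_xml.strip())
    repositories = []
    for workspace in root.findall("app:workspace", NS):
        info = workspace.find("cmisra:repositoryInfo", NS)
        if info is None:
            continue  # skip workspaces without CMIS repository info
        repositories.append({
            "id": info.findtext("cmis:repositoryId", namespaces=NS),
            "name": info.findtext("cmis:repositoryName", namespaces=NS),
            "cmis_version": info.findtext("cmis:cmisVersionSupported", namespaces=NS),
        })
    return repositories


repositories = list_repositories(SAMPLE)
print(repositories)
```

Client libraries such as cmislib do this parsing for you; the sketch is only meant to show that the binding is plain XML over HTTP.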

Page 40: FISL: Content Management Primer

Python

● In the shell:

    virtualenv .
    ./bin/easy_install cmislib
    ./bin/python

● In the Python interpreter:

    # Connect over the AtomPub binding and inspect the repository
    from cmislib.model import CmisClient
    client = CmisClient(
        "http://192.168.56.1:8080/alfresco/cmisatom", "admin", "admin")
    repo = client.defaultRepository
    repo.id
    repo.name
    for (k, v) in repo.getCapabilities().items():
        print("%s: %s" % (k, v))

    for (k, v) in repo.getRepositoryInfo().items():
        print("%s: %s" % (k, v))

    # Create a folder under the root and list its properties
    root = repo.getRootFolder()
    root.name
    folder = root.createFolder('cmis-demo')
    folder.id
    folder.name
    for (k, v) in folder.properties.items():
        print("%s: %s" % (k, v))

● Continued:

    # Create, rename, and delete a text document
    props = {}
    props["cmis:objectTypeId"] = "cmis:document"
    doc = folder.createDocumentFromString(
        'testdoc.txt', props,
        contentString="This is a test showing how to create a text document",
        contentType='text/plain')
    doc.isCheckedOut()
    props = {}
    props['cmis:name'] = "test-updated.txt"
    doc = doc.updateProperties(props)
    doc.name
    doc.delete()
    len(folder.getChildren())

    # Query by name pattern
    result = repo.query(
        "select * from cmis:folder where cmis:name like '%alf%'")
    len(result)
    for i in result:
        print(i.name)

    # Full-text query for documents containing the term 'name'
    result = repo.query(
        "select * from cmis:document where contains('name')")
    for i in result:
        print(i.name)

Page 41: FISL: Content Management Primer

PHP and Drupal

● Drupal CMIS Views
  ● http://drupal.org/project/cmis_views
● Built on Drupal CMIS
  ● http://drupal.org/project/cmis
  ● Configure a repository in settings.php
  ● Enable cmis_sync
  ● Bundles an early release of phplib
● Currently read-only
● Good for exposing unstructured data alongside a structured web page

Page 42: FISL: Content Management Primer

Where to learn more

● cmis.alfresco.com includes a public CMIS server and links to CMIS resources (check out the cheat sheet)
● Read the CMIS specification
● Apache Chemistry site has clients, lightweight server, documentation
● “Getting Started with CMIS” tutorial shows how to use cURL to hit AtomPub bindings directly
● Slideshare has some CMIS related presentations from Alfresco DevCon

Page 43: FISL: Content Management Primer

Questions?

Page 44: FISL: Content Management Primer

Attribution and Licensing

● Copyright 2012, Alfresco Software
● Some images used in this presentation are licensed under the Creative Commons by-attribution non-commercial share-alike license.
● Original work in this presentation is licensed under the Creative Commons by-attribution license.
● Thanks to Jeff Potts for allowing me to base my presentation on his.