Category Archives: Search Technologies

Check out my new blog about managing digital content

I started this blog 5 years ago mainly as a tool to share information about my own life situation with co-workers, friends and family. However, as many of you have noticed, I also have a passion for technology in general and enterprise content management in particular. I have therefore decided to split my blog into two: one where I continue to share experiences from my daily life, and another where I discuss all things digital content. So if you are more interested in technology, search, EMC Documentum and collaboration, you should update the links on your own pages to point to my new blog.

You will find it at: http://contentperspective.se

The Long Tail of Enterprise Content Management

Question: Can we expect a much larger amount of the available content to be consumed or used by at least a few people in the organisations?

Shifting focus from bestsellers to niche markets
In 2006 the editor-in-chief of Wired magazine, Chris Anderson, published his book "The Long Tail – Why the Future of Business is Selling Less of More". The text printed on the top of the cover, "How Endless Choice is Creating Unlimited Demand", is perhaps the best summary of the book. This may have been said many times before, but I felt a strong need to put my reflections into text after reading it. It put a vital piece of the puzzle in place for me when I saw the connections to our efforts to implement Enterprise 2.0 within an ECM context.

Basically, Anderson sets out to explain why companies like Amazon, Netflix, Apple iTunes and several others make a lot of money selling small quantities of a very large set of products. It turns out that out of millions of songs/books/movies, nearly all of them are rented or bought at least once. What makes this possible comes down to three things:

– Democratization of production: the tools and means to produce songs, books and movies are available to almost everybody at a relatively low cost.
– Democratization of distribution: companies can broker large amounts of digital content because the cost of keeping a large stock of digital content is very low compared to real products on real shelves in real warehouses.
– Connecting supply and demand: all this created content meets its potential buyers through search functions, rankings and collaborative reviews.

What this effectively means is that the hit culture, where everything is focused on a small set of bestsellers, is replaced with vast amounts of small niches. That probably has an effect on society as a whole, since the time when a significant share of the population was exposed to the same thing at the same time is over. This is also reflected in the explosion in the number of specialised TV channels and TV/video-on-demand services that let viewers choose not only which show to watch but also when to watch it.

Early Knowledge Management and the rise of Web 2.0
Back in the late 1990s, Knowledge Management efforts thrived with great aspirations of getting a grip on the knowledge assets of companies and organisations. Although there are many views and definitions of Knowledge Management, many of them focused on increasing the capture of knowledge, with the idea that applying that captured knowledge would lead to better efficiency and better business. However, partly because of technical immaturity, many of these projects did not reach their ambitious goals.

Five or six years later the landscape had changed completely on the web with the rise of YouTube, Flickr, Google, Facebook and many other Web 2.0 services. They radically lowered the threshold for contributing information, and the whole web shifted from a focus on consuming information to producing and contributing it. This was in fact just democratization of production, but in this case not only of products to sell but of information of all kinds.

Using the large-scale hubs of YouTube, Flickr and Facebook, the distribution aspect of the Long Tail was covered, since all this new content was also spread in clever ways to friends in our networks, or to niche "consumers" finding information based on tagging and recommendations. Maybe my friend network on Facebook is in essence a representation of a small niche market that is interested in following what I am contributing (doing).

Social media goes Enterprise
When this effect started spreading beyond the public internet into the corporate network, the term Enterprise 2.0 was coined by Andrew McAfee. Inside the enterprise, people were starting to share information on a much wider scale than before, and in some respects the old KM dreams finally came into being. This time it was not because of formal management plans but rather because of social factors and networking that really inspired people to contribute.

From an Enterprise Content Management perspective this also means that if we can put all this social interaction and generated content on top of an ECM infrastructure, we can achieve far more than just supporting formal workflows, records management and retention demands. The ECM repository has the potential to become the backbone that provides all kinds of captured knowledge within the enterprise.

The interesting question is whether this also marks a cultural change in what types of information people devote their attention to. One could argue that traditional ECM systems provide a rather limited, "hit-oriented" consumption of information. The absence of good search interfaces, recommendation engines and collaboration probably left most of the information unseen.

Implications for Enterprise Content Management
The social features in Enterprise 2.0 change all that. Suddenly the same exposure effect can be seen on enterprise content as we have seen on consumer goods. There is no shortage of storage space today. The number of stored objects is already large and will increase a lot, since it is so much easier to contribute. Social features allow exposure of things linked to interests, competencies and networks instead of what management wants to push. People interested in learning have somewhere to go even for niche interests, and those wanting to share can get affirmation when their content is read and commented on by others, even if only by a few. Advanced searching and exploitation of social and content analytics can create personalised mashup portals and push notifications of interesting content or people.

Could this long tail effect make a difference to the whole knowledge management perspective? This time not from the management aspect but rather from the learning aspect. Can we expect a much larger share of the available content to be consumed or used by at least a few people in the organisation? Large organisations have a fairly large number of roles and responsibilities, so there must reasonably be great differences in what information people need and with whom they need to share it. The Long Tail effect in ECM terms could be a way to illustrate how a much larger percentage of the enterprise content is used and reused. It is not necessarily so that more information is better, but this can mean more of the right information reaching more of the right people. Add to that the creative effect of being constantly stimulated by ideas and reflections from others around you, and it could be a winning concept.
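To make the point a bit more concrete, here is a small sketch with my own illustrative numbers (not Anderson's): under a Zipf-like popularity distribution, the many rarely-used items in the "tail" can together account for a usage share rivalling the bestsellers in the "head".

```python
# Illustrative only: assume item popularity follows a Zipf distribution,
# i.e. the item at rank r gets a share proportional to 1/r.

def zipf_shares(n_items, head_size, exponent=1.0):
    """Return (head_share, tail_share) of total usage under a Zipf law."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, n_items + 1)]
    total = sum(weights)
    head = sum(weights[:head_size]) / total
    return head, 1.0 - head

# With a million items, the top 1,000 "hits" and the other 999,000 niche
# items end up with comparable shares of total consumption.
head, tail = zipf_shares(n_items=1_000_000, head_size=1_000)
print(f"Top 1,000 items: {head:.0%} of usage; the tail: {tail:.0%}")
```

The exact split depends on the exponent, but the qualitative point holds: the tail is not negligible once distribution and findability costs drop.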

Sources

Anderson, Chris, "The Long Tail – Why the Future of Business is Selling Less of More", 2006
Koerner, Brendan I., "Driven by Distraction – How Twitter and Facebook make us more productive workers", Wired Magazine, March 20

EMC World 2009: Enterprise Search Server (ESS)

To me, one of the biggest pieces of news delivered during the conference was the new generation of Documentum full-text indexing called the Enterprise Search Server (ESS). This marks the first official signal that EMC Documentum will move away from the OEM version of FAST ESP, which has been in use since Documentum 5.3 (2005). The inclusion of FAST back then meant that Documentum got a solution where metadata from the relational database was merged with text from the content file into an XML file (FTXML) that could be queried using DQL. Before diving into the features of the new technology, I guess everyone wonders about the reason for this decision. The main reasons are said to be:

  • Performance. One FAST full-text node supports up to around 20 million objects in the repository (some customers commented that their experience was closer to 10 M…) and it requires in-memory indices. With Documentum installations containing billions of objects, that means 100+ nodes, which has been a hard sell in terms of hardware requirements.
  • Virtualisation. Apparently, talks with Microsoft/FAST about supporting all Documentum products on VMware made no progress. This has been a customer demand for some time. MS/FAST cites intensive I/O demands as the reason why they were not interested in certifying the full-text index on virtualised hardware.
  • NAS-support.
  • More flexible High Availability (HA) options. Today FAST is clustered by adding new nodes, which leads to a requirement of having the same number of nodes for backup/high availability.

From a performance standpoint, I personally think the current implementation of FAST leads to a slow end-user experience when searching in Documentum. One reason is that a search is first sent to FAST, which delivers a result set irrespective of my permissions. The whole result set must then be filtered by querying it against the relational database. That takes time. This is also why we have integrated an external search engine based on the more modern FAST ESP 5.x server with the Security Access Module, which means that ACLs are indexed and filtering can be done in one step when searching in the external FAST Search Front-End (SFE). More about how that is solved in ESS later on.
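To illustrate the difference between the two filtering strategies, here is a hypothetical sketch. All function and field names are my own invention for illustration; none of them correspond to actual Documentum or FAST APIs.

```python
# Two ways to enforce permissions on search results (illustrative only).

def post_filter(engine_hits, user_groups, fetch_acl):
    """Slow path: the engine returns hits regardless of permissions,
    and each hit must be checked against the repository afterwards."""
    return [hit for hit in engine_hits
            if fetch_acl(hit) & user_groups]   # one ACL lookup per hit

def index_time_filter(index, query_terms, user_groups):
    """Fast path: ACLs are stored in the index, so permissions become
    just another query clause and filtering happens in one step."""
    return [doc for doc in index
            if query_terms <= doc["terms"] and doc["acl"] & user_groups]

# Tiny in-memory "index" to show the idea:
index = [
    {"id": 1, "terms": {"budget", "2009"}, "acl": {"finance"}},
    {"id": 2, "terms": {"budget", "plan"}, "acl": {"hr"}},
]
hits = index_time_filter(index, {"budget"}, {"finance"})
print([d["id"] for d in hits])   # only documents the user may read
```

The slow path costs one repository round-trip per candidate hit, which is exactly why large unfiltered result sets feel sluggish to end users.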

From a business perspective EMC outlines these challenges they see a need to satisfy:

  • End users expect Google/Yahoo search paradigms.
  • IT managers want low cost, scalability, ease of deployment and easy administration.
  • Requirements for large-scale, distributed deployments with multilingual support.
  • Enterprise requirements such as low-cost HA, backup/restore and SAN/NAS support.

The new ESS is based on the xDB technology that came with the acquisition of the company X-Hive, and it leverages the open source full-text indexing technology in the Lucene project. The goal for ESS is to leverage the existing open indexing architecture in Documentum. The idea is to create a solution that really scales, of course with some trade-offs when it comes to space vs. query performance.

ESS supports structured and unstructured search by leveraging best-of-breed XML database and XQuery standards. It is designed with enterprise readiness, scalability, ingestion throughput and high quality of search as core features. It also provides Advanced Data Management functionality (control over where data is placed on disk), which is necessary for large-scale systems. The intention is to give EMC the ability to continue to develop and provide new search features and functionality required by their customer base.

It is architected for greater scalability and has a smaller footprint than the current full-text search, and it scales both horizontally (more nodes) and vertically (more servers on the same node). It is designed to support tens to hundreds of millions of objects per node.

This allows for solutions such as archiving, where there can be a billion+ emails/documents, while preserving high quality of search at scale. Query response time can be throttled up or down based on need: priority can be shifted between indexing and querying.

The installation procedure is also simplified; EMC promises that a two-node deployment can be up and running in less than 20 minutes. The solution is also designed to make it easy to add new nodes to an installation.

ESS is much more than a simple replacement of the full-text engine. It will focus on delivering these additional features compared to existing solutions:
– Low-cost HA (N+1 server based)
– Disaster Recovery
– Data Management
– VMware Support
– NAS Support
– New Administration Framework

The new admin features include a new ESS Admin interface with a look and feel very similar to CenterStage. Since the intention is to support ESS on non-Documentum installations, it is a separate web client. The framework also supports Web Services, a Java API and JMX, and it is open for administration using OpenView, Tivoli, MMC etc.

The server consists of:

  • ESS API
  • Indexing Services, with document batching capability, callback support for searchable indication, and a Content Processing Pipeline with text extraction and linguistic analysis via CPS.
  • Search Services. This provides search on metadata, content or both (XQuery based), as well as multiple search options such as batching, spooling, filters, language, analyser etc. It returns results in an XML format and provides term highlighting, summaries and relevancy. The thread execution management supports multi-query and parallel query. It also includes low-level security filtering.
  • Content Processing Services are responsible for language detection, text extraction and linguistic analysis. The CPS can be local or remote (co-located with content for improved performance). It has a pluggable architecture to support various analysers and/or text extractors, with out-of-the-box support for the Basis RLP and Apache Snowball analysers. However, only one analyser can be configured per ESS. (My question: can I have different analysers on different nodes?) Content processing can be extended by plugins.
  • Node and Data Management Services are the primary interface for all data and node management within ESS. They provide the ability to control routing of documents and placement of collections and indices on disk. They deal with index management and support bind, detach, attach, merge, freeze, read-only etc.
  • Analytics includes APIs and a data model for logging, metrics and auditing, ingestion and search analysis, and facet computation services.
  • Admin Services. The example shown was really powerful: an admin could view all searches made by a user over time and see how long it took to get the first result set. Searches with longer times could be explored by viewing the query to analyse why they took so long.

Below that the xDB can be found, and at the bottom the Lucene indices. The whole solution is 100% Java; xDB stores XML documents in a persistent DOM format and supports XQuery and XPath. Indices consist of a combination of native B-tree indices and Lucene. xDB supports single- and multi-node architectures and has support for multi-statement transactions with full ACID compliance. In addition, it supports XQFT (see an introduction here), a proposed full-text extension to XQuery which includes:

  • LQL via a full-text extension
  • Logical full-text operators
  • Wildcard options
  • Anyall options
  • Positional filters
  • Score variables

ESS includes native security, which means that security information is replicated into the search server and security filtering is done at a low level in the xDB database. This makes searches on large result sets efficient and enables facet computation on entire result sets.

Native facet computation is a key feature in ESS, which is of course linked to the new search interface in CenterStage, based on facets in an iTunes-like interface. Facets are of course nothing new, but it is good that EMC has finally realised that they are a powerful yet easy way to give users "advanced search".
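For readers unfamiliar with facets, here is a minimal, purely illustrative sketch of what facet computation over a result set amounts to; it says nothing about how ESS implements it.

```python
# Facet computation in its simplest form: count distinct attribute values
# across a (permission-filtered) result set so the UI can offer drill-downs.
from collections import Counter

def compute_facets(results, attributes):
    """Return {attribute: Counter of values} for the given result set."""
    return {attr: Counter(doc[attr] for doc in results if attr in doc)
            for attr in attributes}

results = [
    {"author": "anna", "format": "pdf"},
    {"author": "anna", "format": "doc"},
    {"author": "erik", "format": "pdf"},
]
facets = compute_facets(results, ["author", "format"])
print(facets["format"])   # counts per format, e.g. pdf: 2, doc: 1
```

The reason index-level security filtering matters here is visible in the sketch: the counts are only honest if every document in `results` is one the user is actually allowed to see.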

ESS leverages a distributed content architecture (for instance using BOCS) by only sending the raw text (DFTXML) over the network instead of the binary file, which can be very much larger in many cases (such as big PowerPoint files). ESS also utilises the new Content Processing Services (CPS) as well as ACS.

The new solution also makes it possible to do hot backups without taking the index server down first, as is required today. Backup and restore can be done at a sub-index level. The new options for High Availability include:

  • Active/active shared data (the only one available for FAST)
  • Active/passive with clusters
  • N+1 Server based

Things I would like to see but have not heard about yet:

  • Word frequency analysis (word clouds based on document content)
  • Clustering and categorisation (maybe done by Content Intelligence Services)
  • Synonym management
  • Query-expansion management
  • How document similarity is handled by vector-space search (I guess done by Lucene?)
  • Boosting & Blocking of specific content connected to a query
  • Multiple search-views (different settings for synonyms, boost&blocking etc)
  • Visualisation of entity extraction and other annotations
  • Functionality or at least an API to manually edit entity extraction within the index. Semi-automatic solutions are the best.
  • Freshness management.
  • Speech-to-text integration (maybe from Audio/Video Transformation Services)
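The first item on that wishlist is simple enough to sketch. This is a toy illustration of term counting for word clouds, not anything from ESS:

```python
# Word frequency analysis for a word cloud: tokenise the document text
# and count terms, skipping stopwords and very short words.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is"}

def word_frequencies(text, top_n=10):
    """Return the top_n most common non-stopword terms in the text."""
    words = re.findall(r"[a-zA-Z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS and len(w) > 2)
    return counts.most_common(top_n)

sample = "Search in the enterprise: enterprise search is search done right."
print(word_frequencies(sample, 3))
```

A real implementation would of course reuse the linguistic analysis already done in the indexing pipeline (stemming, language detection) rather than a naive regex tokeniser.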

Personally, I think this is a much-needed move to really improve the internal search in Documentum and make much better use of the underlying information infrastructure. It will be interesting to see what effect this has on Microsoft/FAST's ambition to support the Documentum connector. Maybe the remaining resources (with no OEM version to develop) can focus on bringing the connector from the old 5.3 API to the modern 6.5 API. I still see a need for utilising multiple search engines, but as ESS gains more advanced features, the rationale for an expensive external solution may change. The beta of Content Intelligence Studio will be one important step in outlining the overall enterprise search architecture for big ECM solutions. This of course includes tracking what Autonomy brings to market in the near future.

Another thing worth mentioning is that during the past four conferences I have heard quite a few complaints about the stability of the current FAST-based full-text index. It crashes/stops regularly, and often without anybody knowing before users start complaining about strange search results.

A public beta will be released in Q3 2009 and customers are invited to participate. Participants will receive a piece of hardware with ESS pre-installed and pre-configured, and after a few configuration changes in Content Server it should be up and running.

Customers will have the option of upgrading their existing FAST full-text index or running the new ESS side by side with FAST. EMC will also market ESS for non-Documentum solutions.

Be sure to also read Word of Pie’s notes as well as my previous notes from FAST Forward 09 around the future of FAST ESP.

Where the FAST Enterprise Search Platform (ESP) is going now…

I have spent the last week in Las Vegas attending the FAST Forward 09 conference. About a year ago the Norwegian company FAST Search & Transfer was acquired by Microsoft, and like me, customers all over the world wondered what would happen. Some thought it was great to have a huge company with its R&D resources take the platform forward, while others like me feared a technology transition that would include cancelling support for other operating systems and integrating with nothing but Microsoft technology.

It was very clear that the Microsoft marketing department had a lot to say about the conference and what messages were to be conveyed. Somewhere behind all that you could still see some of the old FAST mentality, but it was really toned down. To me the conference was about convincing existing customers that MS is committed to Enterprise Search, and giving Sharepoint customers some idea of what Enterprise Search is all about.

It is clear that the product line is diversifying in a common Microsoft strategy:

Solutions for Internet Business

  • FAST Search for Internet Business
  • FAST Search for Sharepoint Internet sites
  • FAST AdMomentum

Solutions for Business Productivity

  • FAST Search for Sharepoint
  • FAST Search for Internal Applications

FAST Search for Sharepoint won't be available until Office Wave 14 (incl. Sharepoint) is released, so in the meantime there will be a product called FAST ESP for Sharepoint that can be used today and will have a license migration path towards FAST Search for Sharepoint. That product will have a product license of around 25,000 USD, with additional Client Access Licenses (CALs) following in the standard MS manner.

So what does all of this mean for those of us who would like to see FAST ESP continue as an enterprise component in a heterogeneous environment? Well, MS has committed to 10 years of support for current customers, I guess as a gesture towards those who are worried. Over and over again I heard representatives talking about how important those high-end installations on other operating systems are. The same message appeared when it came to connectors and integration with Enterprise Content Management systems like EMC Documentum. Still, most if not all demos were connected to Sharepoint and/or other MS-specific technologies.

The technical roadmap shows that the past year has been devoted to rewriting their next-generation search platform from Java to .Net. The first product to be released is the Content Integration Studio (CIS), which consists of a Visual Studio component (I guess earlier in Eclipse) and a server-side execution engine. This will only be available on Windows, since it is deeply connected to the .Net environment. It looks like a promising product, with support for flows instead of a linear pipeline to handle the processing of information before it is handed off to the index engine. CIS therefore sits in front of FAST ESP, and a combination of actions in flows and in old pipelines can be executed. Information from CIS is written to the ESP, which then creates the index and also processes queries to it.

What I think we can expect is that new innovation will be focused on creating a modular architecture, where CIS is the first module. Features in ESP will then be gradually re-engineered in a .Net environment, creating a common search platform some years into the future. It will likely mean that we will still see one or two upgrades to the core ESP as we know it today, to enable it to function together with the new components. Content Fusion will most likely be the next module that extends ESP, but on a .Net architecture.

When it comes to the presentation logic, where today we have the FAST Search Front-End (SFE), we will see it either as Web Parts for Sharepoint or as AJAX Aerogel from MS. These are currently developed using JavaScript but will include Silverlight later on.

These will initially be offered in both an IIS and a Tomcat flavour, and possibly others if there is demand. They will initially be integrated with ESP and Unity, thus opening up for a new approach to developing a search experience on top of them.

In general I don't like the Microsoft approach of insisting on owning the whole technology stack and refusing to invest in other standards-based projects. Instead of developing their own AJAX libraries they could have used ExtJS or even Google Web Toolkit. While it is not open source, MS argues that it is under a very permissive MS licence that has many of the same qualities. A good thing is that MS is committed to making sure this framework works on all major browsers, including Firefox, Safari and Chrome. It is interoperable with jQuery.

In summary, I think it is kind of a mixed experience. The new features being developed are truly needed for FAST to remain one of the most advanced search engines available. I think many of the features look really promising and I can't wait to get my hands on them. On the other hand, it is clear that things are going proprietary (FAST ESP had a lot of open source in it) and being aligned with the Microsoft stack, thus gradually reducing options. That includes how new technologies are implemented (MS ones instead of open source), what operating systems it will run on, and what the support for developing presentation logic looks like. It means I have to have people who know both Java and .Net, both Flash and Silverlight (possibly JavaFX), and both ExtJS/GWT and MS AJAX/Aerogel.

We are deeply invested in the EMC Documentum platform and would of course like to continue using ESP as a way to add advanced capabilities and performance to our architecture. However, I think I will over time get sick and tired of Microsoft salespeople trying to convince me to use Sharepoint instead of Documentum. For anybody who knows how both platforms work it is almost a joke, but I will most likely have to keep explaining and explaining. I just hope that we can have a decent connector developed for Documentum.

To read more you can go to the FAST Forward Blog, which has many interviews, look at videos in the Microsoft Press Room, and check out the chatter on ffc09-tagged tweets on Twitter. And finally, here is what CMS Watch has to say about it.

First impressions of Documentum Digital Asset Manager 6.5

Before going on my planned sick leave I played around with DAM 6.5 for a while. I will try to summarise a few reflections on this brand new release.

Good things
The interface has got yet another refresh, but with rather small modifications that I guess I won't even notice in a couple of weeks. The biggest change is that some functions have got modal windows, meaning that when you click on Properties you no longer see a big full-screen page but instead a new browser window that lets you see where you were when you clicked. A great improvement, I think. The import/export/check-in processes also have small modal windows with a nice-looking progress bar.

A thing that I just love is the new clusters/facets feature, which appears when performing a search. Your results can then be drilled down by user, topic, date and so forth. This will improve findability hugely. We had these installed in D6 SP0, but they did not work then and seemed to be more closely connected to ECI Services back then.

In general the interface is prettier and looks more distinct and modern. The icons have been slightly improved as well.
Another small improvement is that attributes which have value assistance (dropdowns) but also allow entering a value of your own now have the correct width.

I guess it is not really connected to this upgrade, but I finally managed to find out how to create Presets (rules) for specific folders and users, which was great. Look in the tree structure in DA – not in the menu.

Bad things
The left tree structure has been cleaned up with clearer icons, and updates are based on AJAX (or should be, see below). This works fine in Documentum Administrator 6.5, but for some reason something seems to have been missed when compiling DAM, because there is a small refresh anyway when you click on a folder. Our partner suggests that they have simply inherited from the wrong WDK class.

Another interesting thing is that some features that were highly marketed at EMC World are turned off by default in the configuration files. These include Deep Export and OLE-link support (resolving links in Office documents and importing associated files if desired). That is rather strange, I think, since those are really handy features. OLE linking can also be toggled on/off in Preferences. The effect was that there was no folder export available at all, which is fairly strange. We also had some issues getting import of more than two folders working.

We also have an irritating issue around thumbnails. It seems they cannot be created for PDF files at all, which also means no storyboarding. Reading through the release notes, this is noted as a known bug, and it seems that despite our bug report from earlier this year nothing has been done to fix it. From a usability standpoint that is not so good.

EMC Documentum CenterStage

If you haven't done it already, I recommend a look at the site for the beta of EMC Documentum's new web client called CenterStage. This modern Web 2.0 client has earlier been called both Magellan and IntelliSpace, but EMC now seems to have settled on the name CenterStage. It is kind of funny, because I associate CenterStage with a TV application for Mac OS X found at the CenterStage Project site. Anyway, it is interesting to see how the interface and features look for the free CenterStage Essentials (included in any Content Server license) and the paid version called CenterStage Pro. I have long waited for a good application that can do "Facebook for the Enterprise" while still having all the features of an advanced and full-fledged Enterprise Content Management platform. This seems to be a big step towards that. The key thing is to be able to collaborate around content (documents etc.) but also around people, groups and projects. Although there are good collaboration platforms out there, such as Clearspace, which has some basic integration with Documentum, they still create a lot of duplicate information in separate "stove-pipes". I want the content objects found in the Clearspace platform stored in Documentum, but this is not the case today. What we are looking at for the immediate future is being able to search Documentum content from Clearspace.

Again, back to CenterStage: I believe it will provide a lot of organisations with a client that is a lot more intuitive and useful out of the box than anything we have seen from Documentum before. This is thanks to an ambitious usability project run by Gideon Ansell in the Documentum User Experience group. However, after having had a look at the project release matrix found in the beta community, it looks like the beta of CenterStage Essentials will not have enough features to be the flexible collaboration client I need. Those features will be added later this year; we will just have to wait for the full CenterStage Pro version, I think. I also hope that the few missing pieces, like a full-fledged personal profile, expert location and integration with external presence/instant messaging systems, will be on the schedule for the next update.

This week I will be able to play with the Documentum 6.5 release for the first time, since it is being installed by our Documentum partners at work. I especially look forward to seeing the new Digital Asset Manager (DAM) 6.5 client and TaskSpace 6.5. I also hope that I can get some further information about which release of the embedded FAST InStream search engine is used in this release.

Other EMC World 2008 references

I love air conditioning systems, but they once again seem to have given me a cold, so I am sneezing and coughing all the time and using aspirin (Alvedon in Swedish) to get along. It is a bit sad, since I just love being here at the conference.

Another reflection is that conferences like this involve a lot of walking. These three days I have walked at least 5 km a day. Actually more than I walk on a normal work day 🙂

I met Laurence from Word of Pie the other day, and he has been writing excellent notes from other sessions that I recommend reading:

EMC World 2008: ECM Shared Services in the real world
Random thoughts and Keynote
EMC World 2008: Documentum Performance, Scalability, and Sizing – Part 2
EMC World 2008: Introduction to EMC's Next-Generation Knowledge Worker Client
EMC World 2008: Web 2.0 and Interactive Content Management
EMC World 2008: Documentum Foundation Services (DFS) – Best Practices and Real World Examples
EMC World 2008: Social Computing Meets R&D
Thoughts on EMC World 2008 and the ECM Professional
EMC World 2008: Best Practices for Designing and Deploying an Enterprise Document Capture Solution
EMC World 2008: Documentum Architecture Deep Dive

    EMC World 2008: D6 Webtop – Focus on Knowledge Workers

    Presented by Peggy Ringhausen, Principal Product Manager, during EMC World 2008

    Different areas:
    – Simple to use (any content type)
    – Searchable (flexible, federated, consolidated results)
    – Collaborative (Team-oriented, extended enterprise, secure)
    – Agile (available anywhere, contextual, integrated)

    Presets in D6 – configuration – even more in D7
    Better preferences
    Saved searches improved, plus search templates
    WebTop 6.5 in late July

    She talked about features already available.
    Subscribe other people to content
    Preferences are persistent – no longer any cookies on the client.

    Presets allows you to pick a target and set rules.
    For example, select a folder and set it up so that only certain object types can be created in that folder.
    Only allow certain actions on certain folders.

    Extended search is an optional add-on to WebTop – creates clusters.
    Clusters can be created based on certain attributes.
    Search templates are also part of extended search – they allow you to make some of the values optional and some fixed.
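As I understand it, a search template is essentially a parameterized query where some criteria are locked by the template author and others are left for the searcher to fill in. A little Python sketch of the idea (the function and the template layout are my own invention, not the actual Documentum API):

```python
def build_search(template, user_values):
    """Combine a template's fixed criteria with user-supplied optional ones."""
    criteria = dict(template["fixed"])       # values locked by the template author
    for field in template["optional"]:       # values the searcher may fill in
        if field in user_values:
            criteria[field] = user_values[field]
    # Render as a simple WHERE-clause-style string, in stable order
    return " AND ".join(f"{k} = '{v}'" for k, v in sorted(criteria.items()))

# A template that pins the object type but leaves the author open
template = {"fixed": {"r_object_type": "dm_document"},
            "optional": ["authors", "keywords"]}
print(build_search(template, {"authors": "jdoe"}))
```

The point is simply that a fixed value cannot be overridden by the searcher, while an optional one is only applied when supplied.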

    Collaboration Environment (DCE) is now bundled with WebTop. License key still required though.
    Data Tables are also available through Collaboration, and 6.5 also allows attachments to data tables.
    Events in the calendar object can be imported through iCal exports.
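For reference, this is roughly what a minimal iCalendar event looks like (field names per the iCalendar standard, RFC 5545; the values here are made up):

```text
BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Example//Calendar Export//EN
BEGIN:VEVENT
UID:20080520-0001@example.com
DTSTART:20080520T160000Z
DTEND:20080520T170000Z
SUMMARY:Project review meeting
END:VEVENT
END:VCALENDAR
```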

    D6 SP1 – OLE Link support is optional.
    That feature checks whether there are linked objects, imports those documents too, and creates a virtual document out of all the items. The same thing happens during export.
    A checkbox on the import screen lets you also act on linked documents.
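Conceptually, the import side walks the link graph and gathers everything into one virtual document. A toy sketch of that traversal (the structure and names are invented for illustration):

```python
def collect_linked(doc, link_map, seen=None):
    """Gather a document and everything it links to, depth-first."""
    if seen is None:
        seen = set()
    if doc in seen:                      # guard against circular OLE links
        return []
    seen.add(doc)
    items = [doc]
    for child in link_map.get(doc, []):  # follow each embedded/linked object
        items += collect_linked(child, link_map, seen)
    return items

# report.doc embeds a spreadsheet, which in turn embeds another sheet
links = {"report.doc": ["budget.xls"], "budget.xls": ["chart.xls"]}
print(collect_linked("report.doc", links))
```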

    WebTop D6.5

    Email conversion to EMF format – converting everything to the same parent object.
    Conversion tools convert all emails to the subtype dm_message_archive, a lightweight HTML format called EMC Email Format. An HTML-based email viewing tool lets you view emails and attachments without having to export them.

    Page Refresh Reduction – page refreshes have been reduced as much as possible. A really good thing, I think.

    Modal dialogs can be turned off. Instead, small new browser windows are brought up so you keep the context – for example, properties get their own window instead of the usual Documentum screens that fill the whole browser window.

    Multi-select Drag and Drop is supported.

    HTTP or UCF Choice enhancement.

    Deep Export. This is great, and I can’t really understand why it took them so long.

    Content Transfer Improvement (multithreaded streaming).
    New import screen: a small window with a green progress bar, no longer a white screen – a great user interface improvement, since that screen had a tendency to scare people a little.

    Security testing has been extended. They promised no level 1 or 2 issues…

    Take the WDK components, pull them into a new container and give users a new UI.
    The new UI of WebTop is optional.
    It looks a little like Outlook, with collapsible parts (bars) on the left. No huge tree structure any more.
    The tabs I saw were:
    – Search center
    – Subscriptions
    – Home Cabinet

    User-configurable home page, with a new column to the right for properties, versions and comments. A great improvement, since it almost provides a portal page within WebTop.

    Contextual right click menus.

    Offline client – My Documentum Offline (an OEM product), available between the end of July and the beginning of August.
    It is a D6 SP1 release, free of charge for any WebTop user.
    – The My Documentum Folder (provides access to the latest versions of documents when not connected)
    – Synchronize (Choose specific documents, folders, and subscriptions)
    – Personalize (Tailor to suit individual needs)
    – Resolve issues (Mechanism to resolve conflicts that arise during synchronization)
    The offline client has a small Jet database on the client to hold the metadata.
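Jet is of course Windows-specific, but just to illustrate the idea of a local metadata cache for synchronized documents, here is a sketch using SQLite instead (the table layout and the sample values are my own guess, not the actual schema):

```python
import sqlite3

# Local cache of metadata for documents synchronized for offline use
conn = sqlite3.connect(":memory:")  # the real client would keep a file on disk
conn.execute("""CREATE TABLE offline_docs (
                    object_id   TEXT PRIMARY KEY,
                    object_name TEXT,
                    version     TEXT,
                    modified    TEXT)""")
conn.execute("INSERT INTO offline_docs VALUES (?, ?, ?, ?)",
             ("0900001a80001234", "report.doc", "1.2", "2008-05-20T10:00:00"))
conn.commit()

# On reconnect, cached versions would be compared against the repository
# to detect the synchronization conflicts mentioned above
row = conn.execute("SELECT object_name, version FROM offline_docs").fetchone()
print(row)
```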

    There is still a bit of an overlap between File Sharing Services (FSS) and the offline client, though. They will most likely be merged in some way in a D7 timeframe.

    I learned later on that all these new features will be available in DAM 6.5 as well, since DAM is just an extension of WebTop.

    The vision for the Modern Knowledge Worker

    Presented during EMC World 2008 by John McCormick, GM of the Knowledge Worker Business Unit

    Today’s problem: an information explosion.
    The nature of content management has changed – IM, email, videos etc.
    We spend so much time finding things…

    Who is the Knowledge Worker?
    – Work in different or remote locations
    – Create and work with a variety of different content
    – Engage in dynamic work processes that change frequently
    – Work in teams to get their job done
    – Need access to managed content in their everyday application

    A wide variety of interfaces for people to work from: spaces, wikis, PowerPoint, Outlook etc.

    KW Challenges
    – Proliferation of information silos
    – Work in many dispersed teams
    – Finding the right information
    – Seeing the relationships between types of information
    – Organizing and sharing information
    – Ensuring information is always accurate
    – Adhering to IT-requirements for compliance and governance

    IT Challenges
    – Volume – growth both in the amount and in the types of information
    – Users are always connected – how do you do maintenance and upgrades?
    – Empowerment – users expect on-the-fly customization capabilities
    – Control – corporate regulatory concerns

    Traditional KW Solutions
    – Create silos of information
    – Same info stored in multiple places
    – Multiple search engines & queries
    – Users re-invent the wheel out of frustration and create their own solutions
    – Complex interfaces are more than what users require
    – No immediate access from remote locations
    – Rely on yesterday’s technology (email, shared drives)

    Four pillars of KW
    – The Platform for Web 2.0
    – Web 2.0 client
    – Intelligence from information
    – Access anywhere

    The Platform for Web 2.0

    Wikis, blogs, RSS managed as objects
    Everything exposed as web services
    Can be leveraged in any UI (purpose-built, partner, portal etc.)

    Enterprise Scale
    Built on a repository that can scale to billions
    Wide array of platform services
    Can interoperate with other CMA solutions like TCM
    Any object can be retained, made a record, archived, published

    Services-enabled
    Available to .NET and open environments
    From a very chatty API to a less chatty SOA-based interface
    Supports a wide array of dispersed networks through BOCS
    Extensible services for added functionality

    Vision for Enterprise CM with Web 2.0
    – Author & Publish (Blogger, Youtube, Wikipedia, Flickr) – Ratings on Content, IRM Security. Team Wikis, Collaboration
    – Organize & Manage (Digg, del.icio.us) – Guided navigation, Tagging of items, Classification, Personalized Views
    – Network & Access (LinkedIn, Facebook, Myspace, iPhones) – Enterprise Ready, Secured off-network, Mobile access, scalable infrastructure, Retention & governance

    Pre-configured:
    – Object models
    – Taxonomies
    – Business Processes
    – User Experiences
    – Retention Policies

    BOCS makes sense for the KW platform – it eases the user experience
    BPM
    Retention Policy Services
    De-duplication
    Archive

    Web 2.0 Client

    Personalized
    – Simple to configure
    – Brings relevant information to you
    – Easy to use interface

    Team thru Enterprise (Scale)
    – Customizable team workspaces and templates improve efficiency
    – User Management of Communities
    – Ability to locate experts within an organization

    Extendable
    – Ability to mashup external information sources
    – Components can be extended & created by partners and customers

    Magellan Essentials
    – No cost client
    – Team workspaces
    – Access control
    – Library Services
    – Guided navigation
    – Content Templates
    – Lifecycles

    Full client
    – Low cost client
    – Wikis, Blogs & RSS
    – Extranet Support
    – Personal Spaces (Team Members)
    – Tagging (Tag clouds)
    – Federated Search
    – Visualization
    – Workflow

    Multiple Patterns of Collaboration Supported

    – Org/LOB/Departmental
    – Team & Project Oriented
    – Individual (Ideation)

    Information Intelligence

    – Expansive Search
    Both through EMC’s UI or your own UI for ECIS
    – Analyze & Classify (spot key concepts, detect relationships across information assets)
    – Visualize (timeframe etc.)

    Smart Searching
    – Indexing (real-time search results)
    – Security (even for outside sources)
    – Scalability

    Tagging
    – User created
    – Folksonomies
    – Change over time
    Rule-based classification
    – Metadata qualifiers
    – Confidence weights
    Semantics-based classification
    – Better linguistics support
    – Derives the “gist” of the document
    – Monitors designated information sources and provides updates
    – Extracts key insights from text based on linguistics and indexing technologies
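To make the rule-based part concrete, here is a toy sketch of classification with metadata qualifiers and confidence weights (the rules, weights and threshold are invented for illustration, not taken from the product):

```python
# Each rule pairs a keyword test with a category and a confidence weight
RULES = [
    {"keyword": "invoice",  "category": "finance", "weight": 0.9},
    {"keyword": "contract", "category": "legal",   "weight": 0.8},
    {"keyword": "payment",  "category": "finance", "weight": 0.5},
]

def classify(text, threshold=0.6):
    """Return the best-scoring category, or None if no rule clears the threshold."""
    scores = {}
    for rule in RULES:
        if rule["keyword"] in text.lower():
            # Accumulate evidence per category, capped so it stays a confidence
            scores[rule["category"]] = min(
                1.0, scores.get(rule["category"], 0.0) + rule["weight"])
    if not scores:
        return None
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

print(classify("Please find the attached invoice for payment."))
```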

    Visualization (Recommended, Indexed, Personalized, Aggregated, Guided)
    – Expertise Location
    – Mash-ups (google maps)
    – Personalized Navigations
    – Tag clouds
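A tag cloud really just scales each tag’s display size by how often it has been applied; a minimal sketch of that weighting:

```python
from collections import Counter

def tag_cloud(tags, min_size=10, max_size=30):
    """Map each tag to a font size proportional to how often it was applied."""
    counts = Counter(tags)
    lo, hi = min(counts.values()), max(counts.values())
    span = (hi - lo) or 1  # avoid division by zero when all counts are equal
    return {tag: min_size + (n - lo) * (max_size - min_size) // span
            for tag, n in counts.items()}

tags = ["ecm", "ecm", "ecm", "search", "web2.0", "ecm", "search"]
print(tag_cloud(tags))
```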

    Work Wherever
    – Offline support (synched to the My Documentum folder)
    Use familiar tools
    – MS Office, Adobe Creative Suite and beyond

    Bring ECM to the desktop
    – get control of email and files on the desktop
    – Enforce corporate policies

    Used iPod-iTunes metaphors for how the mobile client worked 🙂

    EMC World 2008 Day 1 Part One

    The conference has started, and I was up early to go to the first seminar, called Effective Classification – From Data to Information, but it was mainly focused on EMC’s storage products and low-level classification and metadata management. There was more or less nothing about Documentum during the first 20 minutes, so I got bored and went over to Introduction to Transactional Content Management instead. That was much better: an overview of EMC’s offerings around BPM and content management, of course focused on the traditional examples like claims management and on integration with document capture products like Captiva. However, I am very much interested in TaskSpace, which provides a good, streamlined interface for workflows with inline preview of associated objects. I also had no idea that there was a solution for direct-attached scanners using a web client.

    After that I had a meeting with David LeStrat and Gideon Ansell. David is PM for the new Magellan client and Gideon works with usability issues in the Documentum product line. We met last year where we provided some info around our network visualization technologies. We presented our project and what we trying to achieve and talked about how influences from the Web 2.0 movement could be used in the Documentum platform. Personalization is important just as the personal page is to actually make the user a node in the system. The current WebTop/DAM clients offers an good interface to interact with content object but still mainly around a folder structure idea. However, there is no place to enter my personal details and my skills. Magellan will offer that and in a sense provide the first step towards a community idea. I also mentioned the need to integrate Magellan with an Instant Messaging Solution. Finally we talked a little about our aspiration to provide a GIS-oriented interface to consume objects with geocoordinates. Currently this is just a mashup based on Google Maps but we of course need a solution that works without internet access. See upcoming posts for more information about Magellan and the other new interfaces.