Our first article in the series focused on the basic Enterprise Content Management (ECM) concepts. Next we will explore the core functionality of ECM systems that can be used to implement ECM practices. These are commonly referred to as “library services”.
The term may inspire visions of school and college days spent among dusty bookshelves, lining up at a counter to check out books. The electronic concept is much cleaner and more efficient, but it shares some common themes. One is unique file identification. Because each file or document has a unique identifier stored in a database, it can be tracked separately. This allows the system to control who can edit or update a file at any given time, which is the core functionality underlying “version control”.
Version control seems simple on the surface, but open the hood and things get complicated. To keep it simple at this stage we will examine the functionality, the “what”, rather than the “how”. We will also discuss the concepts of “check out” and “check in”. The best way to describe them is with a step-by-step example.
A typical example of version control:
- The user creates a new document using their productivity software and saves it to their local hard drive with a name such as “My First Doc.txt”.
- The user logs into their ECM system. We will explore authentication later.
- The user chooses the location for the file in the ECM system. We will go deeper into this later.
- The user places the file in that location with either an import or a check in function. This usually happens by selecting the location and using the user interface to open an import or check in dialog.
- The user interface opens a dialog with fields containing data describing the file. This is called metadata.
- The user fills in any required information in the dialog and completes the process.
- The user interface finds the content file on the local machine and sends it to the server for the ECM system, usually called a “content server”.
- The system stores the content file, either in the database or on a file store, and then updates the database record for that file with the information that was provided.
- The user now looks at the ECM user interface and sees their recently added file.
- The user realizes that they forgot something in the document and needs to edit the text and update it.
- The user selects the file and chooses a function from the user interface to edit or check out the file.
- The system looks up the data for the file and checks to see if the user has permissions to read or change it. If so, the file is copied to the user’s local machine, saved on disk and opened in the appropriate application.
- The end user makes the changes and saves the file.
- The application saves it to a default location on the local machine or network. We will talk about Cloud systems later.
- The end user clicks on the listing for the file and selects an option like check in or update.
- The user interface then fetches the saved file from the local machine and sends it to the content server.
- The content server imports the new file, either overwriting the old one or saving a new one with the same or a different name. This is where things get ‘interesting’, since the “or” is a big decision that depends on the permissions the end user has on the file. Permissions, or access control, in ECM systems can be very sophisticated, depending on the software and the configuration of the system. Typically, a user with ‘write’ permission can overwrite the existing file and leave it as version 1. A user who does not have write permission but does have ‘version’ permission must save the new copy of the file as a new version. We will assume that the user only has version permission in this case.
- Assuming that the user will be ‘versioning’ the file, the content server stores the new file and associates this new file with the old file, using the database tables. The method varies between different systems but the end result is that the first file has the version number identified as version 1 and the second as version 2, or in some cases 1.1.
- The user checks the user interface and sees version 2 of the document displayed. Most systems default to the “current” or latest version.
- The end user chooses to view all versions in the user interface and sees a list of version 1 and version 2 of the document.
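The import and check in steps above can be sketched in a few lines of Python. This is only an illustration of the “what”, not how any particular ECM product works: the `Document`, `Version`, `import_document` and `check_in` names are hypothetical, the content lives in memory rather than in a database or file store, and the permission is passed in as a plain string.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical ID generator standing in for the database's unique identifiers.
_next_id = count(1)


@dataclass
class Version:
    number: int
    content: bytes


@dataclass
class Document:
    name: str
    metadata: dict
    doc_id: int = field(default_factory=lambda: next(_next_id))
    versions: list = field(default_factory=list)

    @property
    def current(self):
        # Most systems default to the latest version.
        return self.versions[-1]


def import_document(name, content, metadata):
    """Store a new file and its metadata as version 1."""
    doc = Document(name=name, metadata=metadata)
    doc.versions.append(Version(1, content))
    return doc


def check_in(doc, content, permission):
    """Overwrite the current version or add a new one, based on permission."""
    if permission == "write":
        # Write permission: replace the content, keep the version number.
        doc.versions[-1] = Version(doc.current.number, content)
    elif permission == "version":
        # Version permission: keep the old file and add a new version.
        doc.versions.append(Version(doc.current.number + 1, content))
    else:
        raise PermissionError(f"'{permission}' does not allow check in")
    return doc.current


doc = import_document("My First Doc.txt", b"first draft", {"author": "alice"})
check_in(doc, b"second draft", "version")
print([v.number for v in doc.versions])  # version 1 and version 2
```

Keeping every version in a list attached to one document record mirrors the idea that the old and new files are associated with each other through the database.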
So why would we go to all that trouble? There are several reasons, and they all relate to multiple users accessing the same content file at the same time. The basic concept works like checking a book out of a library: nobody else can modify it while you have it checked out. In an electronic library the source file itself is not removed from the library; instead a copy is made and sent to the end user requesting it. The database entry of “checked out” will contain a yes or no (1 or 0). If the document is checked out, another user can come along and read it but cannot check it out, edit it or check in a new version. Only one person can check a document out at a time.
This prevents one user from wiping out the work of another user and enforces a pattern of behavior where users take turns to edit, markup and review documents, each with their own version.
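A minimal sketch of that check out rule follows. The `LibraryRecord` class and `CheckOutError` exception are hypothetical names, and one assumption goes beyond the simple yes or no flag described above: the record stores who holds the check out, so that only that user can check the document back in.

```python
class CheckOutError(Exception):
    """Raised when a check out or check in is not allowed."""


class LibraryRecord:
    def __init__(self, doc_id, content):
        self.doc_id = doc_id
        self.content = content
        # Assumption: store the user name instead of a bare 1/0 flag.
        self.checked_out_by = None

    @property
    def checked_out(self):
        return self.checked_out_by is not None

    def read(self):
        # Reading is still possible while someone else has the check out.
        return self.content

    def check_out(self, user):
        if self.checked_out:
            raise CheckOutError(f"already checked out by {self.checked_out_by}")
        self.checked_out_by = user
        return self.content  # a copy is sent; the source stays in the library

    def check_in(self, user, new_content):
        if self.checked_out_by != user:
            raise CheckOutError(f"{user} does not hold the check out")
        self.content = new_content
        self.checked_out_by = None  # release the checked out status


record = LibraryRecord(1, "first draft")
record.check_out("alice")
print(record.read())         # bob can still read the document
try:
    record.check_out("bob")  # but bob cannot check it out
except CheckOutError as err:
    print(err)
record.check_in("alice", "second draft")
```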
There are notable exceptions to that pattern, where the provider of the office application is also the provider of the ECM software. In that case, thanks to the deep integration possible with the editing application, multiple users can edit a single document at the same time without wiping out each other’s work.
As a result the typical behavior is for a user to create a document, check it into the system and release the checked out status. If someone else wants to update the document they need to check it out, edit the document, save it, and then check it back in. This way you end up with multiple versions of a document like a stack of paper, with the most current one on top.
Another aspect of library services is the ability to find the information. In physical libraries, prior to computers, there were large cabinets full of index cards listing the documents or books, each with a unique ID code. You then followed the codes, marked on the shelves and books, to find (or fail to find) your book. In ECM, each document or file gets a unique identifier and also metadata describing it. ECM systems have “search engines” to help search through the data and present you with a list of files and/or folders meeting your criteria. There is another search method, which we will discuss later, called full text indexing. It builds indexes of the contents of text files, so that a search can run against both the metadata and the full text indexes.
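To make the two search methods concrete, here is a small sketch that filters on metadata and also looks words up in a simple full text index. The sample documents, the `build_full_text_index` and `search` functions, and the exact-match rules are all illustrative assumptions; real search engines handle ranking, stemming, file formats and much more.

```python
import re
from collections import defaultdict

# Hypothetical sample data: unique ID -> metadata plus extracted text.
documents = {
    1: {"metadata": {"title": "Travel policy", "author": "alice"},
        "text": "Employees must book travel in advance."},
    2: {"metadata": {"title": "Expense form", "author": "bob"},
        "text": "Submit receipts for travel and meals."},
    3: {"metadata": {"title": "Holiday schedule", "author": "alice"},
        "text": "Offices close on public holidays."},
}


def build_full_text_index(docs):
    """Map each lowercase word to the set of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, doc in docs.items():
        for word in re.findall(r"[a-z0-9]+", doc["text"].lower()):
            index[word].add(doc_id)
    return index


def search(docs, index, metadata=None, words=()):
    """Return sorted IDs matching every metadata field and every word."""
    hits = set(docs)
    for key, value in (metadata or {}).items():
        hits = {d for d in hits if docs[d]["metadata"].get(key) == value}
    for word in words:
        hits &= index.get(word.lower(), set())
    return sorted(hits)


index = build_full_text_index(documents)
print(search(documents, index, words=["travel"]))                # [1, 2]
print(search(documents, index, metadata={"author": "alice"}))    # [1, 3]
print(search(documents, index, {"author": "alice"}, ["travel"]))  # [1]
```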
Another way of finding information is by folder structure, also referred to as taxonomy. This allows the user to browse through logical structures of information, much like using Windows Explorer or other file system browsing tools. Another article will be dedicated to creating taxonomies, but suffice it to say, they are a critical part of an ECM system for several reasons.
Another aspect of library services is the decision making process of providing or denying access to information. This is typically associated with folder structures or taxonomy design and is a part of the decision making process during the design and implementation of such a system. Access controls usually include the following:
- User – A user ID and profile can include a username that is associated with an authentication system used in the organization. Authentication is a separate topic from the user profile in an ECM system, but the two can be linked through an authentication system such as Active Directory or LDAP. There is also Single Sign On (SSO), where the login function is automated. More on that later. An email address is usually required for notifications from the system.
- Groups – A group is basically a container for a list of user names. Sometimes, depending on the system, you can place groups within groups. Groups tend not to have any differentiation other than access control.
- Roles – Roles can be used not only in access control but also in how the user interface presents data to the end user. For example, you may define a role called ‘managers’, and when its members log in they see a dashboard of workflow activity associated with their staff or function. Not all systems have this capability but most do.
- Access Control Lists – ACLs are sets of rules defining access to the files and/or folders. Typically an ACL is a list of groups and roles and the permissions each has on the attached content objects, which are either folders or files. Each file listing in the database then has an ACL, or a collection of ACLs, attached to it. When the content server checks whether an end user can access the content, it takes the user name and looks in the groups and roles of the ACL to determine what access the end user has. The ACL acts as a filter that defines what the end user can see in the system.
- Permissions – When defining access there are a number of different controls for activities that can be set. Some systems are more detailed in how permissions can be set. The most basic permissions are a list such as “None”, “Read”, “Annotate”, “Version”, “Write”, “Delete”. Other more advanced systems may offer features such as control of whether the end user can modify metadata, or even certain fields in the metadata.
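The way groups, ACLs and permissions fit together can be sketched as follows. Two assumptions are made purely for illustration: the basic permissions are treated as an ordered ladder in which each level includes the ones before it, and a user who appears in several groups gets the highest level found. Actual systems may resolve these differently, and the group names and functions here are hypothetical.

```python
# Assumption: each permission level includes all the levels before it.
PERMISSION_LEVELS = ["None", "Read", "Annotate", "Version", "Write", "Delete"]

# A group is a container for a list of user names.
groups = {
    "staff": {"alice", "bob"},
    "editors": {"alice"},
}

# An ACL maps groups to the permission each has on the attached object.
acl = {"staff": "Read", "editors": "Version"}


def effective_permission(user, acl, groups):
    """Return the highest permission the user gets from any group in the ACL."""
    best = 0  # index of "None"
    for group, permission in acl.items():
        if user in groups.get(group, set()):
            best = max(best, PERMISSION_LEVELS.index(permission))
    return PERMISSION_LEVELS[best]


def can(user, needed, acl, groups):
    """Check whether the user's effective permission covers the needed one."""
    granted = PERMISSION_LEVELS.index(effective_permission(user, acl, groups))
    return granted >= PERMISSION_LEVELS.index(needed)


print(effective_permission("alice", acl, groups))  # Version
print(effective_permission("bob", acl, groups))    # Read
print(can("alice", "Write", acl, groups))          # False
```

A user who is not in any group named by the ACL ends up with “None”, which is how the ACL acts as a filter on what each user can see.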
When these different aspects of the core library services are combined, you have a powerful platform where information can be organized, managed and found while access to it stays controlled. Different systems provide different combinations of these basic services. Often the top vendors offer complete ECM platforms with a wide range of “bolt on” accessories that add functionality such as graphics and media services, records management services, web publishing, and workflow or case management tools.
We will discuss each of these services in further articles.