Authentication and Authorization

From DLXS Documentation

Revision as of 16:51, 24 August 2007 by Pfarber

Definitions

It is important to clarify the meanings of the terms authentication and authorization in the following discussion.

Authentication
is the process by which the identity of a user trying to access resources is established.
Authorization
is the process by which the system establishes the resources an authenticated user is entitled to access. For instance, a user may be authenticated but authorized to see some resources and not others, depending on the institution they belong to.

Overview

The system uses three variables in the web server (CGI) environment for authentication and authorization operations.

The variables which determine the collections the current user is authorized to access are:

AUTHZD_COLL

a colon-separated list of collection identifiers that the current user is explicitly authorized to access; i.e., authorized collections

PUBLIC_COLL

a colon-separated list of collection identifiers that the current user may access without explicit authorization; i.e., public collections

The variable that identifies the current user (for middleware such as the Collection Manager that requires you to log in) is:

REMOTE_USER

the username of the current authenticated user (if the user has logged in)

To set up authentication and authorization in DLXS, mechanisms must be put in place to set the values of these variables according to your requirements. The following sections contain several examples of different methods to do this.

Important note: the collection identifier lists in AUTHZD_COLL and PUBLIC_COLL must begin and end with a colon, and be colon-separated.
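
As an illustration of why the leading and trailing colons matter, here is a minimal shell sketch, assuming a consumer that tests membership by searching for the identifier wrapped in colons. The helper name coll_in_list and the identifiers moa and moajrnl are hypothetical, and this is not necessarily how DLXS itself performs the test.

```shell
#!/bin/sh
# coll_in_list LIST COLLID
# Succeeds (exit status 0) if COLLID appears in the colon-delimited LIST.
# Hypothetical helper for illustration; it is not part of DLXS.
coll_in_list() {
    case "$1" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Example value with assumed collection identifiers.
AUTHZD_COLL=":moa:moajrnl:"

if coll_in_list "$AUTHZD_COLL" "moa"; then
    echo "moa is authorized"
fi
if ! coll_in_list "$AUTHZD_COLL" "mo"; then
    echo "mo is not authorized"
fi
```

With the delimiters in place, every identifier can be matched as :id: without special cases for the first or last entry, and a partial name such as mo does not match moa.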

Setting the List(s) of Accessible/Authorized Collections

Using Static Settings

Probably the simplest way to set the list of accessible/authorized collections is to statically set just the PUBLIC_COLL environment variable in the web server virtual host configuration(s). The advantage of this approach is that it is easy and fast; the disadvantage is that it is not very flexible: every user accessing the DLXS server will have the same access permissions. This approach works particularly well for a server that hosts only public collections, since hosting non-public collections generally entails allowing access to some users and not to others. For more information on setting static environment variables with the Apache web server, consult the documentation for the SetEnv configuration directive at the Apache server home page.

The DLXS installation process creates a partial Apache configuration file that uses static settings as an example for you to work from. For more information about this example file, see the Example Apache config sample files documentation.
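
If you assemble the value by hand, it is easy to drop the leading or trailing colon. The sketch below builds the value and prints a line in the form that Apache's SetEnv directive takes (variable name, then value). The helper name make_coll_list and the identifiers coll1 through coll3 are placeholders, not names from DLXS.

```shell
#!/bin/sh
# make_coll_list ID...
# Prints the identifiers joined into the colon-delimited form, e.g. ":a:b:".
# Hypothetical helper for illustration; it is not part of DLXS.
make_coll_list() {
    list=":"
    for id in "$@"; do
        list="$list$id:"
    done
    printf '%s\n' "$list"
}

# Print a SetEnv line that could be pasted into a virtual host configuration.
# Prints: SetEnv PUBLIC_COLL :coll1:coll2:coll3:
printf 'SetEnv PUBLIC_COLL %s\n' "$(make_coll_list coll1 coll2 coll3)"
```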

Using a Custom (Dynamic) Authorization System to Set the Collections Lists

If you require different users to have different lists of authorized resources, then you will need to put a more powerful mechanism in place to dynamically set the values of AUTHZD_COLL and/or PUBLIC_COLL based on the IP address of the user's workstation, domain name, or some other method of authentication. Depending on your requirements, this will probably involve interaction with a campus-wide authentication system combined with a database associating authenticated users with lists of authorized resources.

At DLPS, the environment variables above are dynamically set for use by the DLXS system by an Apache module that queries an Oracle database (for more information on this system, see the DLXS Authentication and Authorization System documentation).

Setting Up Authentication For Use With DLXS

Using Basic Authentication

For sites with simple authentication requirements (e.g., if you just need to control several users' access to the Collection Manager), we recommend using standard HTTP Basic Authentication. Basic Authentication will ask users to enter a username and password for access to the directories you specify; after a user successfully authenticates, the environment variable REMOTE_USER will be set to the user's username, and then can be used by the DLXS system. For more information on configuring Basic Authentication with the Apache web server, consult the documentation at the Apache server home page.

The DLXS installation process creates a partial Apache configuration file that uses Basic Authentication as an example for you to work from. For more information about this example file, see the Apache config sample files documentation.

Using a Custom Authentication System

Any authentication mechanism that sets the REMOTE_USER environment variable (which, by the way, is conventional for all properly-written web authentication systems) will work with DLXS. There are myriad available systems, varying mainly in the specific database or file method used to store usernames and passwords. For more information on authentication modules available for the Apache web server, see the Apache Module Registry.

An Example of a Lightweight, Semi-secure Authentication and Authorization System

If it will suffice to give your users additional access to a fixed list of restricted resources limited by either or both of:

  • the IP address of the user's workstation
  • the user's username in a static list you maintain by hand

then the following setup may meet your needs. It dynamically sets the value of AUTHZD_COLL based on the IP address of the user's workstation or on the user's username (which you maintain together with a password in an Apache htpasswd file). The mechanism uses HTTP Basic Authentication and Apache web server configuration to set AUTHZD_COLL and, optionally, REMOTE_USER, should the user need to log in from outside the IP address range.

The first component of the mechanism consists of two Apache web server Directory configurations.

   <Directory "/usr/local/dlxs/cgi"> 
     # 
     # manually set DLXS authorization 
     #
     # public 
     SetEnv PUBLIC_COLL :MMM:NNN:
     # non-public
     # IP address implies authorization 
     SetEnvIf Remote_Addr "^AAA[.]BBB[.]" dlxs_authzd=true
     # authentication implies authorization 
     SetEnvIf Cookie "session=4g28dh5bfh" dlxs_authzd=true
     SetEnvIf dlxs_authzd "true" AUTHZD_COLL=:XXX:YYY: 
   </Directory>
   <Directory "/usr/local/dlxs/cgi/login"> 
     AuthName "State University Digital Library Collections" 
     AuthType Basic
     AuthUserFile conf/htpasswd.dlxs 
     Require valid-user
     Satisfy all 
   </Directory>

Make these edits:

  • change AAA and BBB to the first two components of your institution's IP address range
  • change XXX, YYY, etc. to the collection identifiers of the collections that have restricted access
  • change MMM, NNN, etc. to the collection identifiers of the collections that are publicly available
  • change /usr/local/dlxs to your $DLXSROOT value
  • change "State University Digital Library Collections" so the name of your institution will appear in the Basic Authentication dialog box

Add the usernames and passwords to the htpasswd.dlxs file as explained in the Apache config sample files documentation.

The second component of the mechanism is a small shell script that sets a cookie and redirects to the user's original URL after successful Basic Authentication.

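   #! /bin/sh
   # Login helper: runs only after Basic Authentication has succeeded.
   # Derive the cookie domain by dropping the first label of the server name.
   DOMAIN=`echo "$SERVER_NAME" | cut -d. -f2-`
   echo "Content-type: text/html" 
   # The cookie value must match the SetEnvIf Cookie test in the Apache configuration.
   echo "Set-Cookie: session=4g28dh5bfh; domain=$DOMAIN" 
   echo ""
   # Send the browser back to the original URL, which DLXS passed in the query string.
   echo "<html>" 
   echo "<meta http-equiv=\"Refresh\" content=\"1; URL=$QUERY_STRING\">" 
   echo "</html>"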
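
The cookie domain in the script is derived by dropping the first dot-separated label of the server name. To see what that produces for your own host, a small sketch such as the following may help. The function name cookie_domain and the host names are hypothetical examples.

```shell
#!/bin/sh
# cookie_domain NAME
# Drops the first dot-separated label, using the same cut invocation
# as the login script.
cookie_domain() {
    printf '%s\n' "$1" | cut -d. -f2-
}

cookie_domain "www.lib.example.edu"   # prints lib.example.edu
cookie_domain "example.edu"           # prints edu
cookie_domain "localhost"             # prints localhost (no dot, passed through)
```

Note the second case: for a two-label server name the result is a bare top-level domain, which browsers generally refuse as a cookie domain, so the derivation would likely need adjusting for such hosts.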

Finally, in $DLXSROOT/lib/LibGlobals.cfg, set $gLoginUrl to

   http://{your server name}/cgi/login?

DLXS will add the current URL after the question mark (?) in $gLoginUrl, and the login script will redirect to that URL after setting the cookie named session. The redirect re-runs DLXS, now with the cookie set. The web server sees the cookie and sets AUTHZD_COLL, and the user then has access to the collections listed in AUTHZD_COLL.

This will give you a workable, albeit weak, security system. In summary, here's how this mechanism works:

  • All users (public) will access the middleware and have access to the collections listed in PUBLIC_COLL. Those within your IP address range will also have access to the restricted collections.
  • The middleware will present a login link when appropriate. This link will take users to the tiny login script above. Basic Authentication will control access to this script according to the entries you make in the htpasswd.dlxs file. Users who authenticate successfully will have a mysterious-looking cookie set for them and then return via <meta http-equiv="Refresh" ...> to the middleware.
  • Upon return to the middleware, with the cookie set properly, AUTHZD_COLL will be populated and the user will thereby have access to the restricted collections in addition to the public ones. This authorization will last until the browser is closed (and the cookie thus discarded).

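The obvious weakness here is that anyone capable of setting the cookie properly can gain access to the restricted collections. The cookie is disguised to look like a session identifier, but it is only a disguise: it is a fixed value shared by all users, and any value will work as long as the login script and the SetEnvIf Cookie directive agree.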
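
The decision the web server makes can be modeled in a few lines of shell, which is handy for reasoning about who ends up with access. This is only a rough model of the intended behavior of the SetEnvIf lines, not a substitute for testing against Apache. The function name authzd_coll is hypothetical, the prefix 198.51. stands in for AAA.BBB., and :XXX:YYY: is the placeholder list from the configuration.

```shell
#!/bin/sh
# authzd_coll REMOTE_ADDR COOKIE_HEADER
# Prints the AUTHZD_COLL value the example configuration is expected to
# produce, or nothing if neither test matches.
authzd_coll() {
    authzd=false
    case "$1" in
        198.51.*) authzd=true ;;                  # models SetEnvIf Remote_Addr
    esac
    case "$2" in
        *"session=4g28dh5bfh"*) authzd=true ;;    # models SetEnvIf Cookie
    esac
    if [ "$authzd" = true ]; then
        printf '%s\n' ":XXX:YYY:"
    fi
}

authzd_coll "198.51.100.7" ""                    # on campus: prints :XXX:YYY:
authzd_coll "203.0.113.9" "session=4g28dh5bfh"   # off campus with cookie: prints :XXX:YYY:
authzd_coll "203.0.113.9" ""                     # off campus, no cookie: prints nothing
```

The second call shows the weakness in concrete terms: the cookie value alone is enough, regardless of how the browser obtained it.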

Authentication and Authorization (AUTHNZ) Extension Module

DLXS implements access to the AUTHZD_COLL and PUBLIC_COLL environment variables described above via a Perl base class called AuthNZ, defined in lib/AuthNZ.pm.

An extension to this base class module was implemented to support Athens authentication and, with programming, extensions to other similar authentication and authorization systems can be created. Following is a brief description of the configuration needed to extend this module.

For example, in $DLXSROOT/web/t/text/home.xsl, observe that the href for the login link comes from the <ReauthLink> XML element. You can see this as XML by viewing your home page with "debug=xml" added to your URL. To enable the link, set the global variable $gAuthenticationEnabled in $DLXSROOT/lib/LibGlobals.pm to q{1} (use the curly brace syntax).

In $DLXSROOT/lib/AuthNZ.cfg, supply the values for your local authentication URLs in the AUTHNZ_libum section. The <ReauthLink> XML will now be populated with those URL hrefs and you'll have a login link.

Note that once the user clicks on the login link they are in your local authentication system, not DLXS. It is up to your local authentication system to set the REMOTE_USER, AUTHZD_COLL, and PUBLIC_COLL environment variables upon successful authentication and then redirect back to DLXS. This is the way DLXS interfaces to an authentication system.
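
When wiring up a local authentication system, it can help to confirm what the web server actually passes to CGI programs. The following is a minimal diagnostic sketch, assuming it is installed temporarily as a CGI script in a directory covered by the same Apache configuration as the DLXS middleware. The function name show_auth_env is hypothetical, and the script is not part of DLXS.

```shell
#!/bin/sh
# show_auth_env
# Emits a plain-text CGI response listing the three variables DLXS reads.
show_auth_env() {
    echo "Content-type: text/plain"
    echo ""
    echo "REMOTE_USER=${REMOTE_USER:-not set}"
    echo "AUTHZD_COLL=${AUTHZD_COLL:-not set}"
    echo "PUBLIC_COLL=${PUBLIC_COLL:-not set}"
}

show_auth_env
```

Because the output reveals usernames and collection lists, a script like this should be removed once the configuration has been verified.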

In addition to access to these variables (which constitute the default communication with the AUTHNZ system), AuthNZ can be subclassed to interface to other AUTHNZ systems such as Athens or Shibboleth. AuthNZ.pm defines an object-oriented class that implements methods to:

  • Recognize one of several possible appropriate authentication and authorization modules
  • Load that module dynamically
  • Determine whether a requested resource is authorized by that module
  • Create side effects in the AUTHZD_COLL, PUBLIC_COLL, and REMOTE_USER environment variables and store information in the session object (dso) to record this authorization
  • Provide services supporting the creation of login and logout URLs

The AuthNZ class, instantiated as an object (anzo), encapsulates and records the results of the authentication and authorization process. This object is attached to the session object (dso) for later reference.

The public interface to this module consists of the following routines.

  • HandleAuthNandAuthZ
  • BuildLoginOptionsPageURL
  • HandleSpecificLoginUrlPI
  • BuildSpecificLoginURL
  • HandleGeneralLoginUrlPI
  • GetGeneralLoginUrl
  • GetCollidLoginURL

