DB notes
From openmichigan
[edit] Table/Column Specific
A ocw_material_categories - Why does this table exist? All "category" column values are set to "Material" in ocw_materials.
A ocw_claims_commission, ocw_claims_fairuse, ocw_claims_permission, ocw_claims_retain - All have the "action" column which is the same enum. Perhaps this can be normalized out into its own table
A ocw_users - The autoincrementing id column is an int, what if there are larger numbers of users. This table came from freakauth light and is currently an int. Can likely be changed. But should we keep using a numerical id value of should we use some sort of calculated hash or an email address?
K ocw_mimetypes - The id column is a tinyint which has a range of -128/127 and 0/255 if unsigned. This should be changed so we have headroom for more mimetypes.
K ocw_acl and ocw_user_profile - Are they related? If so, what is the relation?
K ocw_tags - How is ocw_tags currently used in the code? Is the current material type also using data similar to the ocw_tags entries?
A ocw_courses.collaborators - Using a single record for multiple collaborators is untidy and will result in errors. Use a pivot table to relate users entered into ocw_users and the collaborators column in the ocw_courses table.
A ocw_claims_permission - The contact info should be recorded in its own table instead of being within this table.
[edit] Overall design comments
K datetime vs. timestamp - The datatypes here are inconsistent. Some "created_on" columns are DATETIME type while others are TIMESTAMP. Similarly the "modified_on" columns also vary between DATETIME and TIMESTAMP types. Can DATETIME types by automatically updated within the DB with the current time? Or would this need to be done within application code if the data type was set to DATETIME instead of TIMESTAMP?
K id fields in tables - Should they all be bigints to simplify the schema? Is there advantage to selecting the largest one that we anticipate needing? Is there performance advantage there?
A ORM for OERca 2 - If an ORM is used, we should see if data types permitted are the lowest common denominator across all supported databases. Currently looking at SQL Alchemy which is a Python ORM.
A K Remove redundant models in current code - The current code contains multiple functions that do roughly the same things. Perhaps functionality can be merged into a single function that supports all the functionality required based on passed parameters. How much code will need to be changed in controllers that are calling the models? Will this be too much of a pain?
Suits Multiple institutions - How much work is required to allow multiple institutions to user OERca. Is it at all usable at present? If not, what needs to change?
K All user info in a single place - The ocw_users table should contain information on all users instead of having separate instructors and collaborators etc. Contact information as noted above should live in a separate table.
A All fields with Foreign Key constraints should have the same types as their parents - The current schema has several columns that are constrained by Foreign Keys but don't have the same data types as their parents. Since the data in the constrained column is constrained by the data in the parent column, having the same datatype on both makes sense.
A status values changed from enum to table - There are several tables that contain "status" columns that have the same permitted enum keys. Perhaps these can be put in a separate table. Adding different status values will become easier. Also, are these used to compute status or does the code look elsewhere? So is determining status a matter of looking it up on the DB, or computing it every time using other looked up values. Should this change?
K Multiple roles for single users - The DB currently permits a user to have only one role per course. Should the user be allowed to have multiple roles per course and should the application code allow them to choose which role they want to work in?
[edit] Further Questions
Suits Internationalized Contact Info - Address formats differ between nations. How do we define things in the DB to allow this. Research is required. Very likely a combination of generic fields that are presented in varying order based on the country for which the address is defined. Insert non-formatted text here