"Database design" commonly refers to a wide variety of functions associated with database generation in organizations that electronically publish collections of data or information intended for search, retrieval, or other manipulation by computer. Database design can be interpreted narrowly, broadly, or anywhere in between. This entry uses a broad interpretation; it assumes that design is seldom a one-person effort and may require a team of experts to provide the necessary technical, management, human resources, financial, and subject expertise. Design is not a one-time activity, even though many people think of it as occurring only before a database is created. In a broad sense, database design involves the continued improvement of database products. The principal designer or leader of the design team is unlikely to be an expert in all aspects of design, so he or she must call on and interact with others to accomplish a good, flexible, expandable design.
Database design involves the organization and presentation of data (where the term "data" is understood to refer to data or information or facts of any type) so it can be easily located, accessed, and used. It includes a wide variety of functions and activities that range from selection and acquisition of raw, reduced, or otherwise processed data to a variety of value-adding activities.
Design involves specification of criteria for limiting the type of content by subject matter (i.e., its breadth and depth—often referred to as "horizontal" and "vertical," respectively—in connection with an information system), language, type of data (e.g., abstracts, full text, numeric data, images, audio data), geographic location, file size, and so on. In general, database design covers all aspects of what needs to be accomplished both manually and by computer in preparing, processing, and maintaining a database. Design must include documentation of all of the parameters and activities discussed in this entry.
Content and Value Adding
Database content is roughly determined by the visionary who proposed the product and is then refined by a design team of experts, who consult appropriate reference sources and other specialists to gain further understanding of user needs and of sources for the data. Details will depend on the subject matter or functions that the proposed database will serve. The methods for organizing and indexing the data (or otherwise creating access points to it) will dictate the type of format that is appropriate and the standards that should be considered. Content is the most important element of the database, but next in importance is the added value, which may determine the uses to which the data can be put and the attractiveness of the database.
Adding value applies to virtually any aspect of the database. It applies to basic production processes as well as to enhancements that improve, for example, the content, accessibility, appearance, and usability of databases. Value-adding activities include the following:
- reducing data (where needed),
- formatting data in accordance with standards,
- enhancing, expanding, merging with other data or data records,
- categorizing, classifying, indexing, abstracting, tagging, flagging, and coding to improve accessibility,
- sorting, arranging or rearranging, putting the information into one or more forms that will satisfy users,
- creating visual representations of data (especially for numeric data),
- updating, correcting errors,
- adhering to production schedules, and
- putting the data into searchable form with appropriate access points (and links to other databases and application packages) for search, retrieval, manipulation, and use by users.
These and many more activities are all considered to be a part of adding value to the product.
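As a rough illustration of the last activity in the list, the following Python sketch builds simple access points (an inverted index) over a few records and supports a Boolean AND search. The record fields, the sample data, and the function names `build_index` and `search` are hypothetical and chosen only for illustration; a production system would rely on its own search and retrieval software.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are illustrative only.
RECORDS = [
    {"id": 1, "title": "Online pricing trends",
     "subjects": ["pricing", "online services"]},
    {"id": 2, "title": "Quality of databases",
     "subjects": ["quality", "databases"]},
    {"id": 3, "title": "Pricing database products",
     "subjects": ["pricing", "databases"]},
]


def build_index(records, fields=("title", "subjects")):
    """Map each lowercase term to the set of record ids that contain it."""
    index = defaultdict(set)
    for record in records:
        for field in fields:
            value = record.get(field, "")
            # List-valued fields (e.g., subject headings) are indexed as
            # whole phrases; text fields are split into single words.
            terms = value if isinstance(value, list) else value.split()
            for term in terms:
                index[term.lower()].add(record["id"])
    return index


def search(index, *terms):
    """Return the sorted ids of records that match every term (Boolean AND)."""
    matches = [index.get(term.lower(), set()) for term in terms]
    return sorted(set.intersection(*matches)) if matches else []


INDEX = build_index(RECORDS)
print(search(INDEX, "pricing"))               # records indexed under "pricing"
print(search(INDEX, "pricing", "databases"))  # records indexed under both terms
```

Indexing both title words and assigned subject headings shows how value-adding steps such as indexing and tagging translate directly into additional access points.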
Computers and Information Technology
Database design includes the recommendation and eventual selection of appropriate information technology for processing, storing, and manipulating data, as well as the selection of the media for processing and storing data on site and for distributing data to customers. The database may be produced in several different media (e.g., CD-ROM, DVD, diskettes, hand-held devices, or any new technology that may appear in the marketplace), depending on the type of product that customers need. Any organization that produces databases must remain alert to the development of new technologies that may be of use to the organization for (1) processing, storing, distributing, and using its data, (2) managing the various functions, or (3) generating reports. Computer selection is important because of its centrality to the entire process of database production and use.
Software for Processing, Management, and Search/Retrieval
Database design includes the development or acquisition of software for managing the flow of data and records through all steps of the process: from acquisition and processing of data, to generation of management reports about the data flowing through the system, to delivery to customers or provision of online access for users. Software is required for search/retrieval processes, as well as for manipulation of the data (e.g., using spreadsheets, sorting, rearranging, running statistical programs on data) and the generation of reports in compliance with user requests.
In general, a database design is created for a master database from which a variety of products can be produced. The products can have exactly the same content, and they can be produced in the same format or in differing formats. The products created from the master database can be made available on different media for distribution. The master database can also be used to create subsets of the same data to meet specific customer needs. Subsets of the data can be merged with outside data to create new products, or the records in the master database may have additional data elements added in order to increase the value for specific customers.
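The relationship between a master database and its derived products can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the records, the field names, and the helpers `make_subset` and `enrich` are all hypothetical, and an actual producer would typically work within a database management system.

```python
# Hypothetical master records; the field names are illustrative only.
MASTER = [
    {"id": 1, "subject": "chemistry", "language": "en", "title": "Polymer synthesis"},
    {"id": 2, "subject": "physics", "language": "de", "title": "Quantenoptik"},
    {"id": 3, "subject": "chemistry", "language": "fr", "title": "Chimie organique"},
]


def make_subset(master, **criteria):
    """Return copies of the master records whose fields match all criteria."""
    return [
        dict(record)
        for record in master
        if all(record.get(key) == value for key, value in criteria.items())
    ]


def enrich(records, extra_by_id):
    """Merge outside data elements into records, keyed by record id.

    New dictionaries are returned, so the master database is left unchanged.
    """
    return [{**record, **extra_by_id.get(record["id"], {})} for record in records]


# A subset product for customers interested only in chemistry.
chemistry = make_subset(MASTER, subject="chemistry")

# Hypothetical outside data merged in to add value for a specific customer.
OUTSIDE = {1: {"patent_count": 4}}
custom_product = enrich(chemistry, OUTSIDE)
print(custom_product)
```

Deriving every product from copies, rather than editing the master in place, keeps the master database as the single source from which further products can be generated.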
Quality Assessment and Control
Quality is an important consideration in database design. The designer or design team must indicate the areas where quality should be monitored, determine how it can be monitored, and establish methods for controlling quality. The quality of a database product involves many different aspects of the database, such as reliability of the medium for distribution/access, accuracy of data, timeliness, inclusion of essential data elements in every record, and additional elements that may or may not be present because of the variability of data sources. Customers and users judge databases according to many objective and subjective factors. Accuracy of data, clarity of presentation and ease of using the front end to the information system, and adherence to time schedules are a few of the objective factors. The subjective factors include such things as acceptability of price in relation to the user's budget and how well the database design matches the user's need.
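One of the quality aspects named above, the inclusion of essential data elements in every record, lends itself to automated monitoring. The Python sketch below is only an illustration: the list of essential elements, the sample records, and the function names are assumptions, and a real design would define its own required elements and reporting format.

```python
# Hypothetical list of elements that every record is required to contain.
ESSENTIAL = ("id", "title", "source")


def missing_elements(record, essential=ESSENTIAL):
    """Return the essential elements that are absent or blank in a record."""
    missing = []
    for field in essential:
        value = record.get(field)
        if value is None or str(value).strip() == "":
            missing.append(field)
    return missing


def quality_report(records):
    """Summarize completeness; problems are keyed by position in the batch."""
    problems = {}
    for position, record in enumerate(records):
        missing = missing_elements(record)
        if missing:
            problems[position] = missing
    complete = len(records) - len(problems)
    rate = complete / len(records) if records else 1.0
    return {"problems": problems, "completeness": rate}


# Hypothetical batch of incoming records.
BATCH = [
    {"id": 1, "title": "Record A", "source": "Journal X"},
    {"id": 2, "title": " ", "source": "Journal Y"},   # blank title
    {"id": 3, "title": "Record C"},                   # no source element
    {"id": 4, "title": "Record D", "source": "Journal Z"},
]
REPORT = quality_report(BATCH)
print(REPORT)
```

A report of this kind could feed the management reports discussed earlier, so that completeness is tracked batch by batch rather than checked only once.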
Market Analysis, Pricing, and Marketing
Designers must determine the target market for a given database product in terms of potential size, characteristics, geographic location, language, spending limits, needs/wants for data, and current information use patterns. Much of this depends on a "user needs assessment," which should be done before the design phase or as the first step of the design process. Designers must analyze the potential market, review the competition, compare both the positive and negative aspects of competitor products with the planned database product, estimate prices for types of lease, license, and online use of the database, estimate likely levels of sales from a reasonable fraction of the market, and estimate the design, production, and operating costs. These various types of market data become a part of the business plan for the database. The designers must also consider methods for marketing the database, including the use of site visits, telephone contacts, attendance at conferences/meetings, advertising, and websites that have two-way links to appropriate databases.
When a new database is designed, the design documentation or proposal for creating the database should include a business plan as a part of the general design. Producer organizations normally require detailed cost data for any new product. Three-year or five-year projections (sometimes more) of costs for selected methods for marketing, along with identification of the named sources, targets, and links, should be included in the text description and in the spreadsheets of the business plan. The plan must also include detailed discussion of all aspects of the database, including source material (where will the data come from and what will it cost) required for the data, quality assessment and control, analysis, and all planned aspects of adding value.
The total cost information that is provided for the business plan should include the estimated cost of designing and testing a sample database and the cost of gearing up for production, as well as revenue projections for a specified number of years. With a spreadsheet application (e.g., Excel or Lotus), it is easy to adjust pricing and see how each adjustment affects the projected costs to the producer and the projected revenue from the customers.
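The kind of what-if adjustment described here can also be sketched outside a spreadsheet. The Python example below projects revenue and cost over several years for two candidate prices. All figures, parameter names, and the simple model itself (a fixed annual cost, a per-subscriber cost, and a constant growth rate that does not react to price) are assumptions made for illustration and are not drawn from any actual business plan.

```python
def project(price, subscribers_year1, growth, fixed_cost,
            cost_per_subscriber, years=3):
    """Project yearly revenue, cost, and net result for one price point.

    Simplifying assumption: the subscriber count grows at a constant rate
    and does not depend on the price charged.
    """
    rows = []
    subscribers = subscribers_year1
    for year in range(1, years + 1):
        revenue = price * subscribers
        cost = fixed_cost + cost_per_subscriber * subscribers
        rows.append({
            "year": year,
            "subscribers": subscribers,
            "revenue": revenue,
            "cost": cost,
            "net": revenue - cost,
        })
        # Round to a whole number of subscribers for the next year.
        subscribers = round(subscribers * (1 + growth))
    return rows


# Hypothetical inputs: compare two annual license prices.
BASE = project(price=1000, subscribers_year1=100, growth=0.10,
               fixed_cost=80000, cost_per_subscriber=100)
HIGHER = project(price=1200, subscribers_year1=100, growth=0.10,
                 fixed_cost=80000, cost_per_subscriber=100)

for base_row, higher_row in zip(BASE, HIGHER):
    print(base_row["year"], base_row["net"], higher_row["net"])
```

A fuller model would let the expected number of customers vary with price, since a higher price usually reduces the fraction of the market that will buy.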
The design of a database must take into account the legal problems that are associated with the intellectual property rights, copyrights, and use rights that belong to the author of the original data contained in the database; the rights of the users to have open access to information; the rights and liabilities of the database producer who makes a collection of data available to a wide audience; and the responsibilities that the producer has to the sources of information as well as to the customers who use the database. These aspects, plus financial considerations and pricing, are put into legal contracts (prepared by an attorney for the producer) that are executed between the database producer organization and the data sources and between the producer and the customers. In large organizations, there is generally an attorney on staff to handle such matters; in small organizations, a consulting attorney is employed to develop contracts. The design team should know where the problems lie and should be able to convey the necessary information to the attorney.
See also: Cataloging and Knowledge Organization; Computer Software; Computing; Databases, Electronic; Knowledge Management; Libraries, Digital; Library Automation; Management Information Systems; Retrieval of Information; Systems Designers.
Martha E. Williams