1、软件工程专业外文文献翻译物理数据库设计外文原文:A HISTORICAL PERSPECTIVEFrom the earliest days of computers, storing and manipulating data have been a major application focus. The first general-purpose DBMS was designed by Charles Bachman at General Electric in the early 1960s and was called the Integrated Data Store. It f
2、ormed the basis for the network data model, which was standardized by the Conference on Data Systems Languages (CODASYL) and strongly influenced database systems through the 1960s. Bachman was the first recipient of ACMs Turing Award (the computer science equivalent of a Nobel prize) for work in the
3、 database area; he received the award in 1973. In the late 1960s, IBM developed the Information Management System (IMS) DBMS, used even today in many major installations. IMS formed the basis for an alternative data representation framework called the hierarchical data model. The SABRE system for ma
4、king airline reservations was jointly developed by American Airlines and IBM around the same time, and it allowed several people to access the same data through computer network. Interestingly, today the same SABRE system is used to power popular Web-based travel services such as Travelocity!In 1970
5、, Edgar Codd, at IBMs San Jose Research Laboratory, proposed a new data representation framework called the relational data model. This proved to be a watershed in the development of database systems: it sparked rapid development of several DBMSs based on the relational model, along with a rich body
6、 of theoretical results that placed the field on a firm foundation. Codd won the 1981 Turing Award for his seminal work. Database systems matured as an academic discipline, and the popularity of relational DBMSs changed the commercial landscape. Their benefits were widely recognized, and the use of
7、DBMSs for managing corporate data became standard practice.In the 1980s, the relational model consolidated its position as the dominant DBMS paradigm, and database systems continued to gain widespread use. The SQL query language for relational databases, developed as part of IBMs System R project, i
8、s now the standard query language. SQL was standardized in the late 1980s, and the current standard, SQL-92, was adopted by the American National Standards Institute (ANSI) and International Standards Organization (ISO). Arguably, the most widely used form of concurrent programming is the concurrent
9、 execution of database programs (called transactions). Users write programs as if they are to be run by themselves, and the responsibility for running them concurrently is given to the DBMS. James Gray won the 1999 Turing award for his contributions to the field of transaction management in a DBMS.I
10、n the late 1980s and the 1990s, advances have been made in many areas of database systems. Considerable research has been carried out into more powerful query languages and richer data models, and there has been a big emphasis on supporting complex analysis of data from all parts of an enterprise. S
11、everal vendors (e.g., IBMs DB2, Oracle 8, Informix UDS) have extended their systems with the ability to store new data types such as images and text, and with the ability to ask more complex queries. Specialized systems have been developed by numerous vendors for creating data warehouses, consolidat
12、ing data from several databases, and for carrying out specialized analysis.An interesting phenomenon is the emergence of several enterprise resource planning(ERP) and management resource planning (MRP) packages, which add a substantial layer of application-oriented features on top of a DBMS. Widely
13、used packages include systems from Baan, Oracle, PeopleSoft, SAP, and Siebel. These packages identify a set of common tasks (e.g., inventory management, human resources planning, financial analysis) encountered by a large number of organizations and provide a general application layer to carry out t
14、hese tasks. The data is stored in a relational DBMS, and the application layer can be customized to different companies, leading to lower Introduction to Database Systems overall costs for the companies, compared to the cost of building the application layer from scratch. Most significantly, perhaps
15、, DBMSs have entered the Internet Age. While the first generation of Web sites stored their data exclusively in operating systems files, the use of a DBMS to store data that is accessed through a Web browser is becoming widespread. Queries are generated through Web-accessible forms and answers are f
16、ormatted using a markup language such as HTML, in order to be easily displayed in a browser. All the database vendors are adding features to their DBMS aimed at making it more suitable for deployment over the Internet. Database management continues to gain importance as more and more data is brought
17、 on-line, and made ever more accessible through computer networking. Today the field is being driven by exciting visions such as multimedia databases, interactive video, digital libraries, a host of scientific projects such as the human genome mapping effort and NASAs Earth Observation System projec
18、t, and the desire of companies to consolidate their decision-making processes and mine their data repositories for useful information about their businesses. Commercially, database manage- ment systems represent one of the largest and most vigorous market segments. Thusthes- tudy of database systems
19、 could prove to be richly rewarding in more ways than one!INTRODUCTION TO PHYSICAL DATABASE DESIGNLike all other aspects of database design, physical design must be guided by the nature of the data and its intended use. In particular, it is important to understand the typical workload that the datab
20、ase must support; the workload consists of a mix of queries and updates. Users also have certain requirements about how fast certain queries or updates must run or how many transactions must be processed per second. The workload description and users performance requirements are the basis on which a
21、 number of decisions have to be made during physical database design.To create a good physical database design and to tune the system for performance in response to evolving user requirements, the designer needs to understand the workings of a DBMS, especially the indexing and query processing techn
22、iques supported by the DBMS. If the database is expected to be accessed concurrently by many users, or is a distributed database, the task becomes more complicated, and other features of a DBMS come into play. DATABASE WORKLOADSThe key to good physical design is arriving at an accurate description o
23、f the expected workload. A workload description includes the following elements: 1. A list of queries and their frequencies, as a fraction of all queries and updates. 2. A list of updates and their frequencies. 3. Performance goals for each type of query and update.For each query in the workload, we
24、 must identify:Which relations are accessed.Which attributes are retained (in the SELECT clause).Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be. Similarly, for each update in the workload, we must identif
25、y:Which attributes have selection or join conditions expressed on them (in the WHERE clause) and how selective these conditions are likely to be.The type of update (INSERT, DELETE, or UPDATE) and the updated relation.For UPDATE commands, the fields that are modified by the update.Remember that queri
26、es and updates typically have parameters, for example, a debit or credit operation involves a particular account number. The values of these parameters determine selectivity of selection and join conditions.Updates have a query component that is used to find the target tuples. This component can ben
27、efit from a good physical design and the presence of indexes. On the other hand, updates typically require additional work to maintain indexes on the attributes that they modify. Thus, while queries can only benefit from the presence of an index, an index may either speed up or slow down a given upd
28、ate. Designers should keep this trade-offer in mind when creating indexes.NEED FOR DATABASE TUNINGAccurate, detailed workload information may be hard to come by while doing the initial design of the system. Consequently, tuning a database after it has been designed and deployed is importantwe must r
29、efine the initial design in the light of actual usage patterns to obtain the best possible performance.The distinction between database design and database tuning is somewhat arbitrary.We could consider the design process to be over once an initial conceptual schema is designed and a set of indexing
30、 and clustering decisions is made. Any subsequent changes to the conceptual schema or the indexes, say, would then be regarded as a tuning activity. Alternatively, we could consider some refinement of the conceptual schema (and physical design decisions affected by this refinement) to be part of the
31、 physical design process.Where we draw the line between design and tuning is not very important.OVERVIEW OF DATABASE TUNINGAfter the initial phase of database design, actual use of the database provides a valuable source of detailed information that can be used to refine the initial design. Many of
32、the original assumptions about the expected workload can be replaced by observed usage patterns; in general, some of the initial workload specification will be validated, and some of it will turn out to be wrong. Initial guesses about the size of data can be replaced with actual statistics from the
33、system catalogs (although this information will keep changing as the system evolves). Careful monitoring of queries can reveal unexpected problems; for example, the optimizer may not be using some indexes as intended to produce good plans.Continued database tuning is important to get the best possible performance. TUNING THE
copyright@ 2008-2022 冰豆网网站版权所有
经营许可证编号:鄂ICP备2022015515号-1