Artemis MP3 Search Engine System And Software Requirements Definition
Document last modified: 2000-02-18
Authors:
- Anders Pearson <anp8>
- David Masao Dodobara <dmd69>
- Guillermo M Ramos <gmr9>
- Karl McClare Questelles <kmq2>
- Miriam Shana Adlerstein <msa22>
Table of Contents
- Introduction
- Project Requirements
- System Requirements
- Glossary
1 Introduction
1.1 Purpose of SSRD
The purpose of the System and Software Requirements Definition is to
inform the customer of all the responsibilities, constraints and
limitations associated with the project. This document will help identify and
outline the requirements for the proposed system's performance, usability,
and future maintenance.
1.2 Reference
- Our Operational Concept Definition (OCD) was designed to provide
a general understanding of our project's goals from an external
perspective. That document was then used to help define the
internal specifics of our proposed system and to produce the SSRD.
- Hypertext Transfer Protocol -- HTTP/1.1. R. Fielding, J. Gettys,
J. Mogul, H. Frystyk, L. Masinter, P. Leach, T. Berners-Lee. June
1999. (Format: TXT=422317, PS=5529857 bytes)
http://info.internet.isi.edu:80/in-notes/rfc/files/rfc2616.ps
- "The WWW Common Gateway Interface Version 1.1", David Robinson, Ken
Coar, 09/21/1999, http://info.internet.isi.edu:80/0/in-drafts/files/draft-coar-cgi-v11-03.txt
- In our meeting and correspondence with Professor Nieh, the customer,
following the submission of the OCD, we set out to understand his
exact idea of the means and constraints with which we will accomplish
the previously outlined project goals.
2 Project Requirements
This section specifies the project requirements and relates
them to the acceptability of our final product.
2.1 Budget and Schedule Requirements
2.1.1 Budget Requirements
There are no budget requirements associated with Artemis.
2.1.2 Schedule Requirements
Project Requirement-01
| Title: | Schedule Constraint |
| Description: | This project must be completed by the end of this
semester. |
| Measurable: | By determining how many of the requirements were fulfilled
by the end of the semester. |
| Achievable: | With strict adherence to the project deadlines and
constant communication with our customer, we hope to complete the project
within the constraint, leaving considerable time for testing and
refinement. |
| Relevant: | Failing to complete the project within the scheduled time
can result in a failing grade from the professor. |
| Specific: | Our customer needs the product by the end of this semester
or Guitar Notes will be forced to continue to use an inadequate search
engine. |
Project Requirement-02
| Title: | Enhance current search engine's spidering range |
| Description: | The developers must overhaul the current search engine to
broaden its searching capabilities in order to accommodate the needs of
our customer. The searches should extend to a variety of internet sites
to increase the likelihood of meeting the users' demands. |
| Measurable: | By running test queries before and after implementing
modifications to the search engine and comparing the results. |
| Achievable: | Through extensive research on the part of the developers
as to the best choice of intermediate sites. |
| Relevant: | The achievement of this major goal holds the key to the
passing grade mentioned in Project Requirement-01, and covers the most
pressing modification called for by the shortcomings of the current
system. |
| Specific: | Increasing the searching capabilities of the current search
engine will give the site satisfied web users. |
Project Requirement-03
| Title: | Produce reliable search results |
| Description: | Ensure reliable search results from each query, meaning
that the link returned contains the correct URL, the audio file
actually corresponds to the query, and the audio file still resides
at the associated URL (i.e., has not become a broken link). |
| Measurable: | The number of audio files opened successfully should
increase if all links delivered actually lead to the MP3s they claim to
reference. |
| Achievable: | With the development of a scanning program that identifies
all dead links and deletes them. |
| Relevant: | The quality of matches is at least as important an
enhancement to the system as the quantity of matches generated. |
| Specific: | Providing reliable links increases the popularity of
Guitar Notes as well as its credibility. |
2.2 Development Requirements
2.2.1 Programming Languages Requirements
| Title: | Specified Programming Language == PERL |
| Description: | The customer has requested that all files be written in
Perl 5. |
| Measurable: | If the code is written exclusively in Perl, it will run
on the Perl 5 interpreter. |
| Achievable: | Through intensive training in the Perl programming
language on the part of the developers. |
| Relevant: | The current system is written in Perl, thus allowing easy
integration of the new component that is to be our search engine. |
| Specific: | The use of any other programming language, whether
embedded or used alongside Perl, would result in a great deal of confusion
and the need for more complex documentation. |
2.2.2 Tools Requirements
There are no specific tools requirements for this project. The nature of
Artemis allows for flexibility in the development environment and
resources.
2.2.3 Computer Hardware Requirements
Since this project is being implemented in Perl, which is an interpreted
language, there are no specific hardware requirements.
2.2.4 Computer Communication Requirements
| Title: | Compliance with CGI Protocol |
| Description: | The search engine must be able to communicate with the web
interface via the Common Gateway Interface (CGI). |
| Measurable: | Artemis must be able to correctly process GET and POST
requests as dictated by the CGI. |
| Achievable: | Through careful study of the CGI particulars and
subsequent coding. |
| Relevant: | The search engine requires a means of communication with
the user. |
| Specific: | The nature of the project, a web search engine, calls for
the adoption of a universal means of accepting and returning data. CGI
will allow the user to submit a query, and subsequently allow the search
engine to communicate its findings. |
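As an illustration of what "correctly process GET and POST requests" involves, the following is a minimal sketch, written in Python purely for illustration (the project itself is required to be written in Perl 5). Under CGI the web server passes a GET query in the QUERY_STRING environment variable and a POST body on standard input, with its size in CONTENT_LENGTH. The function name `read_query` and the form field names used in the usage note are assumptions, not part of the Guitar Notes interface.

```python
from urllib.parse import parse_qs


def read_query(environ, stdin):
    """Return the submitted form fields as a dict mapping names to lists of values.

    `environ` is the CGI environment (a dict-like object) and `stdin` is a
    readable text stream. Assumes a URL-encoded form submission.
    """
    method = environ.get("REQUEST_METHOD", "GET").upper()
    if method == "POST":
        # CGI delivers the POST body on standard input; CONTENT_LENGTH
        # says how many characters to read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)
    else:
        # For GET, the server places everything after '?' in QUERY_STRING.
        raw = environ.get("QUERY_STRING", "")
    return parse_qs(raw)
```

In a real CGI script this would be called as `read_query(os.environ, sys.stdin)`; a request for `?artist=Hendrix` would then yield a dictionary with the hypothetical field `artist`.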
2.2.5 Computer Software Requirements
| Title: | FreeBSD Operating System |
| Description: | The Artemis MP3 Search Engine will run on the FreeBSD
operating system. |
| Measurable: | Via testing of Artemis on the target operating system. |
| Achievable: | This is achievable by doing the actual coding and
implementation directly on the FreeBSD operating system. |
| Relevant: | The Guitar Notes server is currently running on the
FreeBSD operating system. |
| Specific: | Keeping the Artemis development compatible with the
current system will minimize the possible complications that could arise
during the integration of the search engine. |
| Title: | Plain Text Data Formatting |
| Description: | Artemis must be able to read and write data in plaintext
format. |
| Measurable: | Testing of the search engine should yield legible
results. |
| Achievable: | Our choice of programming language is ideal for the
manipulation of plaintext. |
| Relevant: | The current Guitar Notes database is a collection of text
files. Our system will need to interact with these files. |
| Specific: | The search engine will be scanning this database. It will
also append newly found URLs to these files. |
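The two operations named above, scanning the text files and appending URLs to them, can be sketched as follows. This is a Python illustration only (the project itself must be written in Perl 5), and the record layout is an assumption: the actual format of the Guitar Notes text files is not described in this document, so the sketch invents a tab-separated line of artist, title, and URL. The names `append_entry` and `find_entries` are likewise hypothetical.

```python
def append_entry(path, artist, title, url):
    """Append one record to the plaintext database.

    Assumed layout (not the real Guitar Notes format): one record per
    line, fields separated by tabs.
    """
    with open(path, "a", encoding="utf-8") as db:
        db.write("\t".join((artist, title, url)) + "\n")


def find_entries(path, term):
    """Return all records whose artist or title contains `term` (case-insensitive)."""
    term = term.lower()
    matches = []
    with open(path, encoding="utf-8") as db:
        for line in db:
            fields = line.rstrip("\n").split("\t")
            if len(fields) != 3:
                continue  # skip malformed lines rather than fail the search
            artist, title, url = fields
            if term in artist.lower() or term in title.lower():
                matches.append((artist, title, url))
    return matches
```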
2.2.6 Standards Compliance Requirements
| Title: | HTTP 1.1 |
| Description: | The transmission of data to the web interface must comply
with the HTTP 1.1 protocol. |
| Measurable: | If we neglect this, the response will generate a server
error. If the protocol is properly complied with, the search results will
be displayed by the web browser. |
| Achievable: | With simple adherence to the HTTP 1.1 guidelines. |
| Relevant: | The search engine is web based. |
| Specific: | It must communicate with the browser to successfully
transmit its findings. |
2.2.7 Evolution Requirements
The customer does not currently have any specific evolutionary ideas in
mind. Professor Nieh does require that in the event that he does
formulate an idea, it should be easy to implement by modifying our current
system. To that end, we will endeavor to keep the code modular, scalable,
and extraordinarily well documented.
2.3 Packaging Requirements
Professor Nieh will provide the team with a test directory on the Guitar
Notes server. After extensive testing has demonstrated the code's
effectiveness and reliability, the team will upload Artemis to the
guitarnotes.com website, where it will face the ultimate test: the random
user. Prof. Nieh will oversee integration of Artemis into the
guitarnotes.com website. A compilation of all technical documentation
produced in parallel with the code will also be delivered at the point of
installation.
3 System Requirements
3.1 System Definition
The search engine itself is the primary component of the
Artemis MP3 Search Engine project. It will search through the local
database, then through the specified additional websites, all the while
attempting to match the user's query as closely as possible. The search
engine's principal companion is a continuous spider that traverses the
WWW in the background searching for audio files to add to the database.
The database will feature a verifier that iterates through the stored URLs
to check that they are still valid and functioning, and removes all that
are not.
System Block Diagram
3.2 System Capability Requirements
Every one of the following capabilities has been deemed mandatory by the
customer. Each one is a priority and a necessary element in the
realization of the project's goals. None are dispensable.
| Title: | Internet searching application |
| Description: | We, the developers, will create a new database of web
sites that are likely to contain up-to-date collections of audio files.
When the user enters a query, the search engine will no longer merely
search the current Guitar Notes database. It will concurrently search,
using some artificially intelligent application, the internet sites listed
in the new database. |
| Measurable: | By analysis of the quantitative improvement in search
findings. |
| Achievable: | Through independent internet research on the part of the
developers to pinpoint the appropriate intermediate sites. |
| Relevant: | Users are more likely to find exactly what they are
looking for when the information from multiple internet resources is
pooled. |
| Specific: | Users can expect both a greater number and an enhanced
diversity of search results. |
| Title: | Periodic filtering of database |
| Description: | The database needs some system to scan the URLs at regular
intervals, removing any dead links detected. |
| Measurable: | Reduction in the number of dead files delivered to the
user could be measured either by soliciting feedback from the user
or by testing/sampling actual results. |
| Achievable: | With some simple code. |
| Relevant: | This function would represent a vital step toward the goal
of transmitting reliable search results. |
| Specific: | Users tend to find dead links or misinformation very
frustrating and discouraging, and could develop negative feelings toward
the Guitar Notes project in general if subjected to them too frequently. |
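The "simple code" for the periodic scan could look roughly like the sketch below. It is written in Python for illustration only (the project itself is required to be in Perl 5). The names `is_alive` and `prune_dead_links` are hypothetical, and treating any failed HEAD request as a dead link is an assumption; some servers reject HEAD, so a real verifier may need to fall back to GET before deleting an entry.

```python
import urllib.error
import urllib.request


def is_alive(url, timeout=10):
    """Return True if a HEAD request to `url` succeeds, False otherwise."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, ValueError, OSError):
        # HTTP errors (404, 500, ...), unreachable hosts, timeouts and
        # malformed URLs are all counted as dead in this sketch.
        return False


def prune_dead_links(urls, check=is_alive):
    """Split `urls` into (alive, dead) lists using the `check` function."""
    alive, dead = [], []
    for url in urls:
        (alive if check(url) else dead).append(url)
    return alive, dead
```

Passing the check function as a parameter keeps the pruning logic testable without network access; the verifier would write the `alive` list back to the database and discard the rest.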
| Title: | Delivery of results in specified format |
| Description: | The customer requires that each MP3 file returned be
accompanied by the artist's name, the song title, the size of the file,
and the nature of the audio file. |
| Measurable: | The testing of the search engine should yield results in
this format. |
| Achievable: | Using a system that extracts the aforementioned data from
the web site associated with the audio file. |
| Relevant: | The data will assist the user in judging the relevance
of the returned file to his/her query. |
| Specific: | In the event that the search has missed its mark in
certain returns, the user will avoid the frustration of having to open
irrelevant audio files. |
| Title: | Continuous background searching application |
| Description: | The search engine should feature a program that
automatically spiders the internet for audio files of random recordings,
24 hours a day, seven days a week, entirely unprompted. Its findings
would be automatically sorted and added to the Guitar Notes database. |
| Measurable: | The size of the database should increase dramatically. |
| Achievable: | Using code that retrieves web pages and scans their links
for audio files. |
| Relevant: | The task of expanding the database manually would consume
countless hours of site developers' time (and thereby interfere with the
aforementioned semester schedule constraint). Relying on submission of
audio file URLs by web users is not a sufficient route to expansion,
primarily because they are the intended beneficiaries, not the
providers, but also because their involvement is unpredictable and
erratic. |
| Specific: | The very blindness of the method is likely to unearth
material/sites that a developer would never conceive of probing. |
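The core step of the spider, scanning a retrieved page's links for audio files, can be sketched as follows. This is a Python illustration only (the project itself must be in Perl 5). The names `LinkCollector` and `extract_links` are hypothetical, and recognizing audio files purely by an `.mp3` extension is an assumption made for brevity.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

AUDIO_EXTENSIONS = (".mp3",)  # assumed; extend if other formats are wanted


class LinkCollector(HTMLParser):
    """Collect the targets of <a href="..."> tags found in one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.audio = []   # links that look like audio files
        self.pages = []   # every other link, to be spidered later

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page they came from.
        absolute = urljoin(self.base_url, href)
        if absolute.lower().endswith(AUDIO_EXTENSIONS):
            self.audio.append(absolute)
        else:
            self.pages.append(absolute)


def extract_links(base_url, html_text):
    """Return (audio_links, page_links) found in `html_text`."""
    collector = LinkCollector(base_url)
    collector.feed(html_text)
    return collector.audio, collector.pages
```

A spider loop would add the audio links to the database and push the page links onto its queue of pages still to visit.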
| Title: | Ongoing transmission of results |
| Description: | The search engine should deliver matches as they are found
rather than forming a collection and delivering the final package. This
will give the user a sense of instant gratification. |
| Measurable: | Reduction in the impatience of the user could be observed
by monitoring the frequency of suspended searches, or by soliciting
feedback from the user. |
| Achievable: | Using code that eliminates the intermediate buffer so that
the search results are flushed to the browser continuously. |
| Relevant: | The new and improved search engine will most likely take
more time than the original, since it will search a greater area. This
circumstance calls for greater caution in keeping the user waiting for
an answer. |
| Specific: | Users of search engines often do not require more than a
small number of matches that meet their criteria, and would consider their
search successful as soon as these were displayed. This method also
creates the impression that the search engine is operating at a greater
speed. |
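Flushing results as they are found can be sketched as below, in Python for illustration only (the project itself is required to be in Perl 5). The function name `stream_results` and the HTML layout are assumptions. The sketch only controls the script's own output buffer; any buffering done by the web server or the browser is outside its reach.

```python
import html
import sys


def stream_results(results, out=None):
    """Write each match to `out` as soon as it is produced and return the count.

    `results` may be a generator, so matches that take a long time to
    find do not hold back the ones already available.
    """
    if out is None:
        out = sys.stdout
    # A CGI response starts with header fields followed by a blank line.
    out.write("Content-Type: text/html\n\n")
    out.write("<ul>\n")
    out.flush()
    count = 0
    for artist, title, url in results:
        out.write('<li><a href="%s">%s - %s</a></li>\n' % (
            html.escape(url, quote=True),
            html.escape(artist),
            html.escape(title),
        ))
        out.flush()  # push this match out now instead of waiting for the rest
        count += 1
    out.write("</ul>\n")
    out.flush()
    return count
```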
3.3 User Interface Requirements
3.3.1 Graphical User Interface Requirements
This project is not responsible for developing a graphical user interface.
The only interface of relevance is the Guitar Notes website, which already
exists.
3.3.2 Command-Line Interface Requirements
The search engine part of the project has no command line interface
requirements.
The spider will have the following command line interface requirements:
% artemis-spider [OPTION]... [URL]... [FILE]...
-s, --start [URL|FILE] (starts the spider with the specified
URL or file containing URLs)
-v, --verbose (verbose output)
-m, --max NUM (maximum number of pages to spider before stopping)
-V, --version (prints version number and exits)
The verifier will have the following command line interface requirements:
% artemis-verify [OPTION]... [FILE]...
-v, --verbose (verbose output)
-V, --version (prints version number and exits)
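The spider's option list above maps directly onto a standard option parser. The following is a sketch in Python for illustration only (the project itself must be written in Perl 5); the function name `build_spider_parser`, the attribute names, and the version string are placeholders, not decisions made in this document.

```python
import argparse


def build_spider_parser():
    """Build a parser for the artemis-spider options listed in this section."""
    parser = argparse.ArgumentParser(prog="artemis-spider")
    parser.add_argument(
        "-s", "--start", metavar="URL_OR_FILE", action="append", default=[],
        help="start the spider with a URL or a file containing URLs")
    parser.add_argument(
        "-v", "--verbose", action="store_true", help="verbose output")
    parser.add_argument(
        "-m", "--max", dest="max_pages", metavar="NUM", type=int,
        help="maximum number of pages to spider before stopping")
    parser.add_argument(
        "-V", "--version", action="version",
        version="artemis-spider 0.0 (placeholder)",
        help="print version number and exit")
    # Trailing URL and FILE arguments from the synopsis.
    parser.add_argument("targets", metavar="URL_OR_FILE", nargs="*")
    return parser
```

For example, `build_spider_parser().parse_args(["-v", "-m", "50", "seeds.txt"])` yields verbose mode, a 50-page limit, and one trailing file argument.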
3.3.3 API Provision Requirements
Must comply with the CGI and HTTP 1.1 protocols. See section 1.2, Reference.
3.3.4 API Usage Requirements
Must comply with the CGI and HTTP 1.1 protocols. See section 1.2, Reference.
3.3.5 Diagnostics Requirements
There are no diagnostic requirements for this project.
3.4 Hardware and Communications Interface Requirements
There are no hardware communications interface requirements for this
project.
3.5 System Levels of Service Requirements
| Title: | Limbo Limit |
| Description: | An upper limit on the response time for the search engine;
the maximum amount of time that the user should be made to wait before
being informed of the status of their search. |
| Measurable: | By monitoring the frequency of suspended searches, or by
soliciting feedback from the user. |
| Achievable: | Implementing an internal timer that ensures that some form
of response is sent to the user before the limit elapses. |
| Relevant: | The user is likely to feel neglected after a certain
amount of time and will assume that no information is forthcoming. |
| Specific: | Users would quickly lose patience with the search engine
as a whole if they were left without feedback on too many occasions. The
user must be informed that no matches have been found yet, but that the
search is continuing. |
| Title: | Verification Frequency |
| Description: | We will need to specify the length of the interval between
database scans. |
| Measurable: | Dead links are removed from the database after the
specified period has elapsed. |
| Achievable: | Via a piece of code that monitors the passing of time, and
calls the verifier method whenever the set time has passed. |
| Relevant: | The interval determines how reliable the database's URLs
are at any given moment. |
| Specific: | Overly frequent scanning is unnecessary, whereas scanning
that is not done frequently enough will not satisfy the customer's
reliability requirement. |
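A piece of code that "monitors the passing of time" and calls the verifier at a fixed interval might be sketched as follows, in Python for illustration only (the project itself must be in Perl 5). The name `run_periodically` and its parameters are hypothetical, and the interval itself is left as a parameter because this document does not yet fix its value. On the FreeBSD server the same effect could instead be obtained by scheduling the verifier with cron.

```python
import time


def run_periodically(task, interval_seconds, rounds,
                     sleep=time.sleep, clock=time.monotonic):
    """Call `task` `rounds` times, starting each run `interval_seconds` apart.

    `sleep` and `clock` are injectable so the schedule can be tested
    without actually waiting.
    """
    next_run = clock()
    for _ in range(rounds):
        delay = next_run - clock()
        if delay > 0:
            sleep(delay)
        task()
        # Schedule from the planned start time, so a slow scan does not
        # push every later scan back.
        next_run += interval_seconds
```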
| Title: | System Popularity |
| Description: | The search engine will have to handle the case of multiple
simultaneous users efficiently. |
| Measurable: | By testing the timing differences in individual searches
when other searches are being conducted simultaneously. |
| Achievable: | Meticulous file locking to protect the data integrity of
each isolated search. |
| Relevant: | We do not want any crossing over in the searches of
different users. |
| Specific: | This supports our aim to produce accurate and reliable
search results. |
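The file locking mentioned above can be sketched with advisory locks, which are available on the FreeBSD target. This is a Python illustration only (the project itself is required to be in Perl 5), and the names `append_locked` and `read_locked` are hypothetical. Writers take an exclusive lock; readers take a shared lock, so many searches can read at once but never see a half-written record.

```python
import fcntl


def append_locked(path, line):
    """Append one line to `path` while holding an exclusive lock."""
    with open(path, "a", encoding="utf-8") as db:
        fcntl.flock(db, fcntl.LOCK_EX)   # blocks until no one else holds a lock
        try:
            db.write(line.rstrip("\n") + "\n")
            db.flush()                   # make sure the data is written before unlocking
        finally:
            fcntl.flock(db, fcntl.LOCK_UN)


def read_locked(path):
    """Return all lines of `path`, read under a shared lock."""
    with open(path, encoding="utf-8") as db:
        fcntl.flock(db, fcntl.LOCK_SH)   # other readers may proceed; writers wait
        try:
            return db.read().splitlines()
        finally:
            fcntl.flock(db, fcntl.LOCK_UN)
```

These locks are advisory: they only protect the database if every program that touches the files (search engine, spider, and verifier) uses them.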
4 Glossary
- API - Application Programming Interface
- The set of functions and conventions through which one piece of
software makes use of another
- CGI - Common Gateway Interface
- The Common Gateway Interface (CGI) is a simple interface for running
external programs, software or gateways under an information server in a
platform-independent manner. See section 1.2, Reference.
- FreeBSD
- This is a free UNIX operating system for Intel architecture
computers
- HTTP - HyperText Transfer Protocol
- The Hypertext Transfer Protocol (HTTP) is an application-level protocol
for distributed, collaborative, hypermedia information systems. See
section 1.2, Reference.
- Interpreter
- A program that translates high level instructions to machine code
at runtime.
- PERL - Practical Extraction and Report Language
- A programming language written by Larry Wall that excels at
manipulation of textual data.
- Scalability
- The ability of the hardware and software used for limited user testing to
provide the same functionality to multiple internet users
html document maintained by Anders