When you print a document your computer does not send it directly to the printer. In most cases the document is first sent to a print server (which may be running on your local machine). This server is a process that waits for incoming documents needing printing, assigns incoming documents to their proper printer queue, and sends them one at a time to the proper printer.
A print server may manage more than one printer, and may accept requests from many different computers via network connections. Its main job is to ensure that multiple documents are not sent to a single printer at the same time. The print server assigns an identification number to each document being printed, which allows users to check on the progress of the print job. This identification number may also be used to cancel print requests, or change their job priority. Some print servers maintain statistics on printer usage, and provide billing information for users who pay to print documents.
Printers are good examples of slow I/O devices, as it is not uncommon for a print job to take several minutes to complete. We cannot keep programs blocked waiting for their print jobs to complete, so printing is almost always done asynchronously (in the background). Print servers often store copies of the document to be printed in a working directory on disk. This technique is called "spooling". One advantage of this technique is that it allows a program to request that a document be printed, and then forget about it. The print server then assumes the responsibility of watching the printers and sending the next document to a printer when it becomes ready. Another advantage is that a user can continue to modify a document after it is queued for printing, without interfering with the printing of the original version.
- For this assignment you are going to write a print server (in C).
- You will not use actual printers, but will instead "print" documents by appending them to the end of a file on disk. Since real printers, like all devices, appear to be files, your server will not be very different from an actual print server. Your server will support at least two simulated printers.
- Your print server will accept print requests over a network (using TCP/IP).
- This means you will need to write client programs which talk to the server.
- A client will need to discover the server's IP address using multicasting.
- The print server will sleep for 30 seconds per job, to simulate the time it takes for the printer to complete.
- In addition to the server, you will need to write two small utilities which talk to the server:
  - a print command (similar to lp) which will request a file be printed (possibly stdin)
  - a status command which will
    - locate (discover) the server on the network
    - show the printers supported by the server
    - show the contents of the queue associated with a particular printer
    - allow the user to cancel a print job which has not completed
    - display statistics (# bytes/pages printed for each job)
- New print requests will store a copy of the document to be printed in a spool directory, and the server will print this document when the printer is available.
- Your server may optionally run a document through some sort of filter before printing it. This will be part of the request from the client.
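To make the "printing" step concrete, here is a minimal sketch of appending a spooled document to the file that stands in for a printer. The function name `print_spooled_file` and its parameters are my own invention for illustration, not a required interface. The delay is a parameter so that you can pass 30 in the server and 0 while testing.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Append the spooled document to the file that stands in for a printer.
 * delay_seconds simulates the time the printer is busy.
 * Returns the number of bytes "printed", or -1 on error.
 * (Hypothetical helper; the name and signature are assumptions.) */
long print_spooled_file(const char *spool_path, const char *printer_path,
                        unsigned delay_seconds)
{
    assert(spool_path != NULL && printer_path != NULL);

    FILE *in = fopen(spool_path, "rb");
    if (in == NULL)
        return -1;

    FILE *out = fopen(printer_path, "ab");  /* "a": append, never truncate */
    if (out == NULL) {
        fclose(in);
        return -1;
    }

    char buf[4096];
    size_t n;
    long total = 0;
    while ((n = fread(buf, 1, sizeof buf, in)) > 0) {
        if (fwrite(buf, 1, n, out) != n) {
            total = -1;
            break;
        }
        total += (long)n;
    }
    if (ferror(in))
        total = -1;

    fclose(in);
    if (fclose(out) != 0)
        total = -1;

    if (total >= 0)
        sleep(delay_seconds);               /* pretend the printer is busy */
    return total;
}
```

Because the output file is opened in append mode, two jobs sent to the same simulated printer end up one after the other in the file, just as two documents would come out of a real printer in sequence.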
We haven't covered this in class yet (like much of what this project will rely upon). I suggest reading ahead, and investigating these topics on your own with the intent of fixing things once we cover them in class. I think you'll get more out of the later lectures this way, since you'll have an idea of what you need to focus on.
Having said that, one of the problems the server will need to address is how to handle more than one task at a time. The server may need to simultaneously listen for new print requests, handle one or more new print requests just received, send a new document to a printer which has just finished its previous job, and respond to one or more status requests.
That sounds like a lot of stuff, but each task is really pretty simple if we look at it separately.
We have already seen one technique for performing concurrent (at the same time) tasks: forking a new process. For this assignment we'll use a better technique: POSIX threads. Most of the problems we are faced with will be relatively easy to solve using these threads. Consider that the tasks listed above will need to communicate with each other, and that multiple concurrent tasks will need to access a print queue. How do we keep these from interfering with each other? POSIX threads provides mechanisms to share memory among different tasks (threads) and to prevent more than one thread from accessing the same area of memory at a time. I have allocated two class sessions to discuss POSIX threads.
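As a small taste of what this looks like, here is a sketch in which several threads share one counter and a mutex keeps them from stepping on each other. I have used a job ID counter as the shared data because the server will need one anyway; the names `next_job_id` and `claim_ids_concurrently` are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long last_id = 0;          /* shared by every thread */

/* Hand out the next job ID. The mutex makes the read-modify-write
 * of last_id safe when several threads call this at once. */
unsigned long next_job_id(void)
{
    pthread_mutex_lock(&id_lock);
    unsigned long id = ++last_id;
    pthread_mutex_unlock(&id_lock);
    return id;
}

/* Thread body: claim as many IDs as the int pointed to by arg says. */
static void *worker(void *arg)
{
    int count = *(int *)arg;
    for (int i = 0; i < count; i++)
        next_job_id();
    return NULL;
}

/* Start nthreads threads that each claim per_thread IDs, wait for all
 * of them, and return the last ID issued (0 if a thread failed to start). */
unsigned long claim_ids_concurrently(int nthreads, int per_thread)
{
    assert(nthreads > 0 && per_thread >= 0);

    pthread_t *tids = malloc(sizeof *tids * (size_t)nthreads);
    if (tids == NULL)
        return 0;

    int started = 0;
    while (started < nthreads &&
           pthread_create(&tids[started], NULL, worker, &per_thread) == 0)
        started++;

    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    free(tids);

    if (started != nthreads)
        return 0;

    pthread_mutex_lock(&id_lock);
    unsigned long result = last_id;
    pthread_mutex_unlock(&id_lock);
    return result;
}
```

Without the lock, two threads could read the same value of `last_id` and hand out the same ID twice. The same pattern (lock, touch the shared data, unlock) applies to the print queues. Depending on your system you may need to compile with `-pthread`.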
This is an important part of the assignment, and we have a large block of class time to cover this topic. You will communicate with your server over a network using sockets and TCP/IP. These sockets act like files, so in a sense it will be like writing to a file and having another process read the data out. Most of your work will be to set up the network connection, and to know when new data is available.
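Here is a rough sketch of the setup on each side of a TCP connection. The helper names `make_listener` and `connect_to` are my own, and to keep the example safe to run anywhere the listener binds only to the loopback address; a real server would more likely bind to `INADDR_ANY` so other machines can reach it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket listening on 127.0.0.1. Passing port 0 lets the
 * kernel choose a free port; the port actually used is stored in
 * *bound_port. Returns the listening descriptor, or -1 on error. */
int make_listener(uint16_t port, uint16_t *bound_port)
{
    assert(bound_port != NULL);

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    int yes = 1;   /* allow quick restarts of the server on the same port */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    socklen_t len = sizeof addr;
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 8) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }
    *bound_port = ntohs(addr.sin_port);
    return fd;
}

/* Connect to ip:port. Returns a connected descriptor, or -1 on error. */
int connect_to(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

The server calls `accept()` on the listening descriptor to get one new descriptor per client. After that, both sides use ordinary `read()` and `write()` on their descriptors, which is the sense in which a socket acts like a file.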
You'll also use multicasting to allow the client programs to locate their server. In a sense you will send a query ("is my print server out there?") to the network, and if a machine running your server sees the request it will respond. Once the client receives the response it will have enough information to establish a normal TCP/IP session. (This will be an important feature, since it is likely that more than one of you will be working on the same CSE machine at the same time.)
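The client side of discovery might look roughly like the sketch below. Everything about the protocol here is an assumption made for illustration: the query text `PRINTSERVER?`, the reply format `PRINTSERVER <tcp-port>`, and the function names. You will also have to pick your own multicast group address and UDP port.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

#define QUERY "PRINTSERVER?"        /* hypothetical query text */

/* Parse a reply of the (assumed) form "PRINTSERVER <tcp-port>".
 * Returns 0 and stores the port on success, -1 otherwise. */
int parse_reply(const char *msg, unsigned short *tcp_port)
{
    assert(msg != NULL && tcp_port != NULL);

    unsigned port;
    if (sscanf(msg, "PRINTSERVER %u", &port) != 1 ||
        port == 0 || port > 65535)
        return -1;
    *tcp_port = (unsigned short)port;
    return 0;
}

/* Send the query to the multicast group and wait up to timeout_sec for
 * one reply. On success, *server holds the responder's IP address and
 * the TCP port it advertised, ready to pass to connect(). */
int discover_server(const char *group, unsigned short udp_port,
                    int timeout_sec, struct sockaddr_in *server)
{
    struct sockaddr_in dest;
    memset(&dest, 0, sizeof dest);
    dest.sin_family = AF_INET;
    dest.sin_port = htons(udp_port);
    if (inet_pton(AF_INET, group, &dest.sin_addr) != 1)
        return -1;

    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    /* Give up if nobody answers within the timeout. */
    struct timeval tv = { .tv_sec = timeout_sec, .tv_usec = 0 };
    setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    int rc = -1;
    if (sendto(fd, QUERY, strlen(QUERY), 0,
               (struct sockaddr *)&dest, sizeof dest) >= 0) {
        char buf[64];
        struct sockaddr_in from;
        socklen_t fromlen = sizeof from;
        ssize_t n = recvfrom(fd, buf, sizeof buf - 1, 0,
                             (struct sockaddr *)&from, &fromlen);
        unsigned short tcp_port;
        if (n > 0) {
            buf[n] = '\0';
            if (parse_reply(buf, &tcp_port) == 0) {
                *server = from;                 /* responder's IP address */
                server->sin_port = htons(tcp_port);
                rc = 0;
            }
        }
    }
    close(fd);
    return rc;
}
```

On the server side, a UDP socket bound to the same port joins the group with `setsockopt(fd, IPPROTO_IP, IP_ADD_MEMBERSHIP, ...)` and answers each query with a unicast `sendto()` back to the sender. If each of you chooses a different group address or port, your clients will not find someone else's server by accident.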
As the name suggests, a queue is an important data structure for this project. What will be stored in the queue? Some sort of structure which contains information on a print request: for example, the document name, the user making the request, the identification number, and the filename in the spool directory.
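One possible shape for this, sketched as a singly linked FIFO list. The field sizes and the names `print_job`, `print_queue`, `queue_push`, and `queue_pop` are only suggestions, and the sketch leaves out locking, which you will need once several threads touch the same queue.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* One queued print request (field sizes are arbitrary choices). */
struct print_job {
    unsigned long id;            /* identification number */
    char document[256];          /* name the user gave */
    char user[64];               /* who made the request */
    char spool_file[256];        /* copy in the spool directory */
    struct print_job *next;
};

/* FIFO queue: jobs are added at the tail and removed from the head. */
struct print_queue {
    struct print_job *head;
    struct print_job *tail;
};

/* Append a new job to the queue. Returns 0 on success, -1 if out of memory. */
int queue_push(struct print_queue *q, unsigned long id, const char *document,
               const char *user, const char *spool_file)
{
    assert(q != NULL);

    struct print_job *job = calloc(1, sizeof *job);
    if (job == NULL)
        return -1;

    job->id = id;
    /* snprintf truncates over-long names and always null-terminates. */
    snprintf(job->document, sizeof job->document, "%s", document);
    snprintf(job->user, sizeof job->user, "%s", user);
    snprintf(job->spool_file, sizeof job->spool_file, "%s", spool_file);

    if (q->tail != NULL)
        q->tail->next = job;
    else
        q->head = job;
    q->tail = job;
    return 0;
}

/* Remove and return the oldest job, or NULL if the queue is empty.
 * The caller owns the returned job and must free() it. */
struct print_job *queue_pop(struct print_queue *q)
{
    assert(q != NULL);

    struct print_job *job = q->head;
    if (job == NULL)
        return NULL;
    q->head = job->next;
    if (q->head == NULL)
        q->tail = NULL;
    job->next = NULL;
    return job;
}
```

Cancelling a job that has not printed yet would mean walking the list to find a matching `id` and unlinking it, which is one reason a linked list is convenient here.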
Q: Why is the filename included? Couldn't we just use the document name?
A: What if a user printed two different documents with the same name? Or two different versions of the same file?
Q: Where do the identification numbers come from?
A: The server creates them. They just need to be unique numbers. A counter would work fine.
Q: How do I know how many pages are in a document?
A: We approximate this based on the number of lines. Assume 60 lines per page, and count one line per linefeed in the document.
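In code, the estimate could look like the sketch below. Rounding a partial page up to a whole page is my assumption, since the rule above does not say what to do with leftover lines; the function names are also just placeholders.

```c
#include <assert.h>
#include <stddef.h>

#define LINES_PER_PAGE 60

/* Count linefeeds in a buffer: one line per '\n'. */
size_t count_lines(const char *data, size_t len)
{
    assert(data != NULL || len == 0);

    size_t lines = 0;
    for (size_t i = 0; i < len; i++)
        if (data[i] == '\n')
            lines++;
    return lines;
}

/* Pages needed for a number of lines, rounding a partial page up
 * (assumption: 61 lines counts as 2 pages). */
size_t pages_for_lines(size_t lines)
{
    return (lines + LINES_PER_PAGE - 1) / LINES_PER_PAGE;
}
```

Since documents arrive in chunks over the network, you can call `count_lines` on each chunk as it is spooled and add up the results, then convert to pages once at the end.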
Q: How wide is a page?
A: It doesn't matter for our purposes. We can assume that the printer truncates lines that are too wide (as some older printers did).
Q: How do the client programs talk to the server?
A: They send a multicast request for their server, and use the response to learn the contact information (address and port).
Q: How does the server know which printers to use?
A: Have it read a configuration file when it starts. Each printer should have a one-word unique name.
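The format of the configuration file is up to you. As one possibility, the sketch below assumes one printer name per line, skips blank lines and lines starting with `#`, and ignores a name that has already been seen. The function name `load_printers` and the limits are invented for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_PRINTERS 8
#define NAME_LEN 32

/* Read printer names from an open configuration file.
 * Assumed format: one single-word name per line; blank lines and
 * lines whose first word starts with '#' are skipped.
 * Returns the number of distinct names stored in names[]. */
int load_printers(FILE *cfg, char names[][NAME_LEN], int max)
{
    assert(cfg != NULL && names != NULL);

    char line[256];
    int count = 0;
    while (count < max && fgets(line, sizeof line, cfg) != NULL) {
        char word[NAME_LEN];
        /* %31s reads one whitespace-delimited word, at most 31 chars. */
        if (sscanf(line, "%31s", word) != 1 || word[0] == '#')
            continue;

        int duplicate = 0;
        for (int i = 0; i < count; i++)
            if (strcmp(names[i], word) == 0)
                duplicate = 1;
        if (duplicate)
            continue;           /* names must be unique */

        strcpy(names[count], word);
        count++;
    }
    return count;
}
```

A fuller format might put the output file for each simulated printer on the same line as its name, but that is a design decision for you to make.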
Q: How many queue data structures will I need?
A: One per printer.
Q: What is a filter script?
A: A shell script that acts as a filter. The script is started with the document as its stdin. The script writes a modified document on its stdout. The server prints this.
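Running such a script from the server could be sketched as follows, using `fork`, `dup2`, and `execl` to connect the document to the script's stdin and a new file to its stdout. The function name `run_filter` and the choice to start the script with `/bin/sh` are assumptions for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run a shell script with in_path as its stdin and out_path as its
 * stdout. Returns the script's exit status, or -1 if it could not be
 * run or was killed by a signal. */
int run_filter(const char *script, const char *in_path, const char *out_path)
{
    assert(script != NULL && in_path != NULL && out_path != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: rewire stdin and stdout, then become the script. */
        int in = open(in_path, O_RDONLY);
        int out = open(out_path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
        if (in < 0 || out < 0 ||
            dup2(in, STDIN_FILENO) < 0 || dup2(out, STDOUT_FILENO) < 0) {
            perror("run_filter");
            _exit(127);
        }
        close(in);
        close(out);
        execl("/bin/sh", "sh", script, (char *)NULL);
        perror("execl");        /* only reached if exec failed */
        _exit(127);
    }

    /* Parent: wait for the filter to finish. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The server would then print the filtered output file instead of the original spool file. For example, a script containing the single line `tr a-z A-Z` would cause the document to be printed in upper case.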