Reclaim Disk Space in TerpConnect with UNIX


General information on TerpConnect account quotas

The Division of Information Technology TerpConnect system (including GRACE) has a limit on the amount of disk space, or quota, that you can use for storing files. The current default quota on TerpConnect is 5 gigabytes (GB). If you come close to or exceed your allotted quota, you may see warnings and other symptoms, described in the next section.


How a disk quota problem can manifest itself

If you are near or over your disk quota, the problem can show up in different ways, depending on how you access the system. If you only log in to the TerpConnect lab Windows machines, it might be as obvious as a warning message in a pop-up window when you log on or off. If you log in to the UNIX environment via a telnet/ssh connection, you may see a warning message at login time (more on this later).

You will need to log in to the TerpConnect system via a telnet/ssh client to issue the commands listed below.


Determining How Much Disk Quota is Being Used and Where Space Is Being Used

If you log in to the TerpConnect system via a telnet/ssh client and you are using 85% or more of your disk quota, you will see a warning when you log in:

Warning: Disk usage at 95%; quota=50485760K, used=50485700K.

The most likely reasons are files left over from compilations or previous editing sessions.

First, check your quota by logging in to TerpConnect (via telnet/ssh) and typing the following at the system prompt:

quota

The quota command shows how much of the 5 gigabyte (GB) quota is being used. The output gives the quota and the amount used in kilobytes (KB), along with the percentage used:

y> quota
Volume Name    Quota      Used       % Used   Partition
user.USERID    5000000    4900000    98%

The quota tool reports space in kilobytes (KB). For example, the 5 GB quota is shown as 5000000 KB.
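
The percentage in the listing can be checked by hand with shell arithmetic. The following is a minimal sketch, assuming a POSIX sh is available (your login shell may be csh/tcsh, so save it as a script and run it with sh rather than typing it at the prompt). The function names pct_used and free_kb are hypothetical helpers, not TerpConnect commands, and the numbers are the ones from the sample quota output.

```shell
#!/bin/sh
# Hypothetical helpers (not part of TerpConnect): both take the
# quota and the used space in KB, as printed by the quota command.

# Percentage of the quota in use, rounded down to a whole number.
pct_used() {
    echo $(( $2 * 100 / $1 ))
}

# Space still available, in KB.
free_kb() {
    echo $(( $1 - $2 ))
}

# Values from the sample quota output.
quota_kb=5000000
used_kb=4900000

echo "Used: $(pct_used "$quota_kb" "$used_kb")%"
echo "Free: $(free_kb "$quota_kb" "$used_kb") KB"
```

With the sample values this reports 98% used and 100000 KB (about 100 MB) free.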


Finding and Removing Browser Cache Files

Browser cache files are often the culprit in over- (or near-) quota situations. Firefox and Internet Explorer store the text and images of recently visited pages locally, and these cache files can take up several megabytes of disk space. You have two options for clearing leftover cache files: delete them manually from within the browser (which must be done for each browser you use), or issue a command in the UNIX environment to delete them.

The TerpConnect systems have a script to remove leftover browser cache files:

clearcache

This will remove the cached files and clear up some space. The script runs the quota command before and after attempting to clear cache files, so you can see how much space (if any) was reclaimed. It also lists the directories in which it found browser cache files to delete.

By modifying the preferences of the browser and setting the size limit for the browser cache to zero, you can limit the amount of disk space taken up by browser cache files. Be aware that the lower you set the cache size, the fewer files can be stored locally. This means that more pages, images, etc. will have to be loaded from the remote host rather than from your local cache, so the rendering of some web pages will be slower.


Files Removed Did Not Give Enough Space

If clearing your browser cache does not reclaim enough space, you will need to do a little sleuthing to determine where the space is being used. First, change directory ('cd') to the top level directory of your account (which is NOT your home directory):

y> cd  /users/USERID

The disk usage (du) command shows the size of each file or directory in the current directory:

y> du -sk * | sort -n

The "s" option returns a summary of a directory's contents (rather than a full listing), and the "k" option shows the output in 1 KB increments (rather than the default 512 byte increments). The sort command lists them in ascending size order, with the largest directory last.

This will show which directory is using your space: "home," "pub," or any other folder. (Ignore the "backup" directory; that is for the nightly backup snapshot and does not impact your disk usage.)

y> du -sk * | sort -n

6018 pub
6234 random folder
95045 backup
182819 home


In this example, the home directory is where most of the space is being used.
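
The listing step can also be wrapped in a small script that leaves out the "backup" snapshot, since it does not count against your quota. This is only a sketch under assumptions: it expects a POSIX sh (run it as a script if your login shell is csh/tcsh), and usage_by_dir is a hypothetical name, not an existing TerpConnect command.

```shell
#!/bin/sh
# Hypothetical helper: summarize each entry in a directory in KB,
# skip the "backup" snapshot, and print the largest entry last.
usage_by_dir() {
    (
        cd "${1:-.}" || exit 1
        for entry in *; do
            [ -e "$entry" ] || continue          # empty directory: nothing to list
            [ "$entry" = "backup" ] && continue  # nightly snapshot, not counted
            du -sk "$entry"
        done | sort -n
    )
}

# Example call (USERID is a placeholder for your own login name):
#   usage_by_dir /users/USERID
```

Looping over the entries one at a time, rather than passing * straight to du, keeps names that contain spaces intact and makes it easy to skip a directory by name.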

Directories and Files

Once you have determined which directory is using the most space, change into that directory and run the 'du' command again, in a recursive manner, until you determine where the space is being utilized:

y> cd directory
y> du -sk * .??* | sort -n | tail

This will list all of your files (including "dot files" like ".login", ".cshrc", etc.), again in increasing size order, with the largest listed last. (The ".??*" pattern matches dot files whose names have at least two characters after the dot, which skips "." and "..".) In the command above, the output of du is first sorted and then piped through the tail filter to show only the last (largest) ten files/directories. You can then examine this list to determine where the space is being used and decide how to deal with different files. Sizes are listed in kilobytes (1024-byte units), so an entry of "1000" is 1000 kilobytes, or roughly one megabyte:

y> du -sk * .??* | sort -n | tail

1059 Trash
1126 .matlab
1353 mozilla.test.dirs.gz
1782 fy06.doc
2410 jh-doc.mail
25213 .mozilla
33433 ns_imap

Some directories you may see are ".mozilla" (Firefox configurations and working directory), ".2kprofile" (TerpConnect Windows configurations and working directory), and "Library" (TerpConnect Macintosh configurations and working directory). The largest entry in your own listing is the best place to look for unwanted space usage; for example, if the ".2kprofile" directory is taking up 152.3 MB, start there. Change (cd) into that directory and run the du command again, repeating the process until you find the largest file(s) and/or directory(ies).
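
As an alternative to repeating cd and du by hand, the standard find command can walk the whole tree in one pass and report the largest individual files. This swaps in a different technique from the one described above, and it is a sketch under assumptions: it expects a POSIX sh (run it as a script if your login shell is csh/tcsh), and largest_files is a hypothetical name, not an existing TerpConnect command.

```shell
#!/bin/sh
# Hypothetical helper: list the largest regular files under a directory.
#   $1 = directory to search (default: current directory)
#   $2 = how many files to show (default: 10)
# Sizes are in KB; the largest file is printed last.
largest_files() {
    dir=${1:-.}
    count=${2:-10}
    find "$dir" -type f -exec du -k {} + | sort -n | tail -n "$count"
}

# Example call (USERID is a placeholder for your own login name):
#   largest_files /users/USERID/home 10
```

Unlike "du -sk *", this reports only files, not directory totals, so it is most useful once you already know which directory holds most of the space.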


Dealing with files that are taking space

Once you determine where space is being used, you will then need to decide what to do to free some space. Your main choices are to delete the file(s), move them off the account to alternate storage, or compress the contents to take less space.

To delete (remove) a file you no longer need, use the 'rm' command:

y> rm  filename

You can use a secure file transfer program (sftp, scp) to copy the file to another host. Once you are sure you have a good copy on the remote host, remove the local copy.

You can also burn the large file(s) to a CD on TerpConnect Personal Computers (PCs) and Macintosh computers. Once you verify that you can retrieve the file from the CD, you can then delete it from your TerpConnect account.

You can use the GNU file compression utility 'gzip' to pack the file into a smaller space:

rac2> gzip  filename

This replaces the file with filename.gz, which can be as small as 1/10th the size of the original. You can examine the contents of the compressed file with the 'zcat' command, as long as the original was a text (not binary) file:

zcat  filename

You can uncompress the file back to its original state by typing:

rac2> gunzip  filename
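
To see how much space compression actually saves for a given file, the gzip step can be wrapped so that it reports the size before and after. This is a minimal sketch, assuming a POSIX sh and GNU gzip (run it as a script if your login shell is csh/tcsh); gzip_report is a hypothetical name, not an existing TerpConnect command.

```shell
#!/bin/sh
# Hypothetical helper: compress one file with gzip and report its
# size in KB before and after. The original is replaced by FILE.gz.
gzip_report() {
    file=$1
    before=$(du -k "$file" | awk '{print $1}')
    gzip "$file" || return 1
    after=$(du -k "$file.gz" | awk '{print $1}')
    echo "$file: $before KB -> $after KB"
}

# Example call (filename is a placeholder):
#   gzip_report filename
```

The savings depend heavily on the contents: plain text usually shrinks a great deal, while files that are already compressed (images, archives) may barely change.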

Hopefully, the above steps will clear enough disk space to get things working smoothly again.


For assistance, contact the Call Center.