
Category: Maintenance

Hitting disk space limit? Near disk quota?

If you see funny behavior on your account, it might be that you are running out of disk space.
A good citizen might do the following every few months, but that ain’t me babe.

The command “quota” will tell you how close you are to your quota.

If you are close (or over!!!), then it’s probably junk files doing it, which can safely be trashed.

But how to find the culprits?  How to separate the chaff from the wheat?

(See update at bottom for some of Bruce’s tips on how to find offending files,
but immediately below this line are Joel’s less sophisticated directions.)

du -ak | sort -n | tail   will list the ten biggest files and directories under the current directory (sizes in kilobytes, biggest last).  That, of
course, doesn’t necessarily mean they are trash.  But common places
with lots of trash are (in my experience)

<your home directory>/.Mathematica/Paclets/Temporary      (note the dot before Mathematica!)
This often has enormous garbage there. Clear out any contents (but
probably not the directory “Temporary” itself). I had the courage to byte
this bullet after reading
https://mathematica.stackexchange.com/questions/64130/mathematica-appdata-folder-is-taking-up-too-much-space
Note that the comment there also makes the reasonable statement that
the contents of any directory called Temporary or the like are fair game!

<your home directory>/.mozilla/firefox will have one or more directories with gibberish names.
(note the dot before mozilla!)
If there is more than one, then trash all but the newest.

For me at least, even in the newest, I had a cache2 directory whose enormous contents I
trashed.  I also googled how to force firefox to automatically clear its cache
whenever it is closed, so that this garbage doesn’t accumulate!
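
Before trashing anything, it can help to see how much each of the usual suspects above is actually holding. The sketch below only reports sizes and deletes nothing. The function name junk_report is made up for illustration, and the two paths are simply the ones named in this post (on a newer setup your firefox cache may live somewhere else).

```shell
#!/bin/sh
# Sketch only: report the size (in KB) of the junk locations named in this post.
# junk_report is a hypothetical helper name; nothing is deleted.
junk_report() {
    home_dir="${1:-$HOME}"
    for d in "$home_dir/.Mathematica/Paclets/Temporary" \
             "$home_dir"/.mozilla/firefox/*/cache2; do
        # An unmatched glob stays literal, so test before calling du.
        if [ -d "$d" ]; then
            du -sk "$d"
        fi
    done
}
```

Run it as   junk_report   for your own home directory, or   junk_report /path/to/some/home   for another one.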


Here are Bruce’s more sophisticated directions from an email on July 15, 2024:

I ran this report:
du -hx /var/share/home/curtina | grep -P '^[^\s]+M'
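
Bruce’s pattern keeps only the lines whose human-readable size ends in M, so anything already in the G range is filtered out along with the small stuff. Here is a sketch of a variant that works in plain kilobytes and sorts numerically, so nothing falls through because of its unit. The name big_dirs and the 100 MB default threshold are my own choices for illustration, not Bruce’s.

```shell
#!/bin/sh
# Sketch: list directories under a tree that use at least MIN_MB megabytes,
# biggest first. big_dirs and the default threshold are illustrative choices.
big_dirs() {
    top="${1:-$HOME}"
    min_mb="${2:-100}"
    # -x stays on one filesystem, -k reports sizes in kilobytes.
    du -xk "$top" 2>/dev/null |
        awk -v min_kb="$((min_mb * 1024))" '$1 >= min_kb' |
        sort -rn
}
```

For example,   big_dirs "$HOME" 500   lists every directory holding 500 MB or more.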

 

 

 

clear firefox cache cleanly to free up space and not overflow quota

A cleaner way to free up firefox cache space was provided by Mike Tie via Bruce Duffy.

I noticed I couldn’t log in to Carleton email via astronet firefox, and these guys said it was because of a carcass in the cache. So this is the procedure for clearing the cache, which can also often bring a user’s space down below their quota limit:

Firefox has cached the old info, and now you need to clear the firefox cache. To do that, launch firefox, click the three bars in the upper right corner of the firefox window, click Preferences, click the lock icon (middle left of the window), scroll down to “Cookies and Site Data”, click on “Clear Data…”, make sure both items are checked, and click “Clear”, and then click “Clear Now”.

shutdown and reboot

One needs to be root or have sudo privileges to do this:

sudo shutdown -rv 2

-rv 2 means:

r is restart after shutdown;

v means verbose

2 means in 2 minutes

building python 2.7 & related stuff on astronet

Yuping:

references:

https://www.digitalocean.com/community/tutorials/how-to-set-up-python-2-7-6-and-3-3-3-on-centos-6-4

https://www.digitalocean.com/community/tutorials/common-python-tools-using-virtualenv-installing-with-pip-and-managing-packages

——————————————————–

I decided for now to install python 2.7 instead of python 3.0 because that’s what most astronomy python libraries are compatible with (e.g. psrchive). If someone wants to build python 3.0, just substitute 3.0.* for 2.7.12 in what follows and that should work. Note that I install it in /usr/local/ on each of the computers (I’m not sure what exactly will happen if multiple hosts try to run the same python), so the following is repeated for each computer we have.

——————————————————

prereqs: make sure you do

sudo yum install tcl tcl-devel tk tk-devel

and check to make sure that all prereqs specified in

https://www.digitalocean.com/community/tutorials/common-python-tools-using-virtualenv-installing-with-pip-and-managing-packages

are satisfied.


install:

First download python from https://www.python.org/ and untar it

>xz -d Python-2.7.12.tar.xz

>tar -xvf Python-2.7.12.tar
>sudo cp -r Python-2.7.12 /usr/local/src/

now we can cd into /usr/local/src/Python-2.7.12 and run

>sudo ./configure --prefix=/usr/local
>sudo make

make will complain about not having the following

>Python build finished, but the necessary bits to build these modules were not found:
>bsddb185           dl                 imageop
>sunaudiodev

These four we can live without (see http://www.kelvinwong.ca/2010/08/02/python-2-7-on-dreamhost/) but if you are missing more than that, google it up.

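Now do the following (altinstall is very important: it installs only the versioned python2.7 and does not create a plain “python” command that would shadow the system python; otherwise you are gonna break CentOS)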

> sudo make altinstall

Now python should be built with executable under /usr/local/bin/python2.7


Now build pip which does package management and virtualenv which lets
you install python library without sudo-ing and makes complicated
dependencies easier to handle (see reference article 2)

 

First you need setuptools, because pip depends on it

You can download it from https://pypi.python.org/packages/source/s/setuptools/, untar it and copy it over to /usr/local/src/

then do

> cd setuptools-1.4.2

> sudo /usr/local/bin/python2.7 setup.py install

then you download pip from https://bootstrap.pypa.io/get-pip.py into /usr/local/src/ and do

> sudo /usr/local/bin/python2.7 get-pip.py

and pip should be installed.

Finally you can do the following to get some common libraries installed:

> sudo /usr/local/bin/pip2.7 install numpy scipy matplotlib astropy ipython

You only need to pick the packages once. On a single computer, go to some NFS-mounted directory (e.g. your home directory) and do

> /usr/local/bin/pip2.7 freeze > package_list.txt

which writes the list of currently installed packages to the file package_list.txt. Then you can hop (ssh) onto the other computers and just do

> sudo /usr/local/bin/pip2.7 install -r package_list.txt

And done!
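
Before running the install on another host, it can be handy to see which packages that host still lacks. A minimal sketch, assuming you have saved a freeze list from each host into the shared directory; missing_packages and the file names in the usage line are hypothetical.

```shell
#!/bin/sh
# Sketch: print the packages that appear in a reference freeze list but not
# in another host's freeze list. missing_packages is a hypothetical helper.
missing_packages() {
    ref_list="$1"
    host_list="$2"
    tmp_ref=$(mktemp)
    tmp_host=$(mktemp)
    # comm needs both inputs sorted the same way.
    sort "$ref_list" > "$tmp_ref"
    sort "$host_list" > "$tmp_host"
    # -23 keeps only the lines unique to the first file.
    comm -23 "$tmp_ref" "$tmp_host"
    rm -f "$tmp_ref" "$tmp_host"
}
```

Usage would look like   missing_packages package_list.txt package_list_otherhost.txt   . Note that a package installed at a different version on the two hosts shows up as missing, since the whole name==version line is compared.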

more to come on installing python libraries without being sudo.

Login Hangs After CentOS/RedHat Kernel Update

June 15, 2015

A second incident of login hanging after a kernel update has been found. The first incident was reported on Aug 2, 2014. As before, when users type in their user name and password at the login window and hit login, the login process freezes and never gets to the desktop. It appeared to only affect tcsh (using .cshrc) users.

The solution is to hit ctrl-c and then rm .history to get rid of the .history file. Note that if you do it from ssh, it might cause the host to reboot.

Canopus and Algol got updated over the weekend to kernel version 2.6.32-504.23.4.el6.x86_64. Hence it seems like it is only affecting these two hosts.

-Yuping & Bruce

To access weekly online backup on canopus last updated 2024 August

Newest update, 2024 Aug, to reflect that backups have been stored on canopus for several years:

Go to canopus:/thuban2-backups

Bruce has an automatic (cron) job do a weekly backup of thuban onto canopus.

On canopus,

df -h gives the various mount points, incl.
“Filesystem” /dev/mapper/centos_canopis-thuban2--backups
“Mounted on” /thuban2/backups

Navigate down to whatever file you’re looking for, and copy it over to a safe place (Careful: do not overwrite a newer file of the same name on thuban!)
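
One way to make the “do not overwrite” warning hard to violate is to copy through a small helper that refuses to touch anything already present at the destination. This is only a sketch; restore_copy is a made-up name, and the paths you pass to it are whatever you found while navigating the backup tree.

```shell
#!/bin/sh
# Sketch: copy a file out of the backup tree without touching anything that
# already exists at the destination. restore_copy is a hypothetical helper.
restore_copy() {
    src="$1"
    dest_dir="$2"
    target="$dest_dir/$(basename "$src")"
    if [ -e "$target" ]; then
        echo "refusing to overwrite existing $target" >&2
        return 1
    fi
    # -p keeps the original timestamps and permissions where possible.
    mkdir -p "$dest_dir" && cp -p "$src" "$target"
}
```

Copying into a fresh scratch directory (e.g.   restore_copy <file in the backup> ~/restored   ) and comparing by hand afterwards is the safest pattern.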

See also post on root superuser privileges on thuban2  (although current post you are reading deals with canopus, which is where weekly backups are stored).

–joel

Issues Upgrading from RHEL 5.2 to 5.3

As of the date of this posting, the latest version of Red Hat — and the version being used on all astronet machines — is 5.3, identified by kernel versions >= 2.6.18-128

If a machine ever has to be upgraded from 5.x to 5.3, note that Bruce and I ran into some hiccups with yum during the update process.

Ideally, machines can be upgraded to a new RHEL release version simply by
1.) Removing any excluded packages by commenting out any exclude=XXXXX lines in /etc/yum.conf
2.) Running yum -y upgrade
3.) Coming back ~25 mins later and rebooting the machine.
4.) Uncommenting the traditional package excludes so updating can continue automatically as before
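
Steps 1 and 4 can be done by hand in an editor, but a sketch of doing them with sed is below. The function names are hypothetical, each run leaves a .bak copy next to the file, and the file defaults to /etc/yum.conf (so it needs root). Be aware that the second function uncomments every “#exclude=” line, including any that were already commented out before you started.

```shell
#!/bin/sh
# Sketch: comment out / restore exclude= lines in a yum.conf-style file.
# Function names are hypothetical; a .bak copy is kept on each run.
comment_excludes() {
    conf="${1:-/etc/yum.conf}"
    cp -p "$conf" "$conf.bak" &&
        sed 's/^exclude=/#exclude=/' "$conf.bak" > "$conf"
}

uncomment_excludes() {
    conf="${1:-/etc/yum.conf}"
    cp -p "$conf" "$conf.bak" &&
        sed 's/^#exclude=/exclude=/' "$conf.bak" > "$conf"
}
```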

Unfortunately, as of this posting the 5.3 version of the tog-pegasus package from RedHat refuses to do an update install – and in fact hangs the update process. All of our machines that were up-and-running when 5.3 was released tried to get this package (because of our automatic update cron job) and the yum process was permanently hung.

The fix:
1.) See if any yum processes are currently being hung up by tog-pegasus
>> ps auxwww | grep -i yum
2.) If there are any yum processes running, and it looks like they’ve been running for a while, they’re probably hung. Reboot to kill the processes.
3.) When you’re back online, check again to see if any yum processes are running and kill (all of) them.
>> ps auxwww | grep -i yum
>> kill -9 <PIDs>
4.) Uninstall the following packages (tog-pegasus and openoffice must be wholly removed)
>> yum erase tog-pegasus openoffice.org-*
5.) Remove any excluded packages by commenting out any exclude=XXXXX lines in /etc/yum.conf
6.) Now try the update again
>> yum -y upgrade
7.) Come back in ~25 mins and make sure everything has completed. When yum tells you it’s done, reboot.
8.) Reinstall openoffice by running the script below or by hand with yum install
>> /etc/secret/clientconfig/install-programs/install-programs.sh
9.) Clear yum’s unfinished-transaction log so it forgets about the whole ordeal and doesn’t bug us about it
>> yum-complete-transaction --cleanup-only

Don’t bother reinstalling tog-pegasus, it’s nothing we will ever need.

**AS OF 2/4/2009 AND TO THE EXTENT OF MY KNOWLEDGE, ALL ASTRONET MACHINES ARE UPGRADED TO RHEL 5.3 AND FUNCTIONING**

Change users quota (MUST BE *THUBAN* ROOT)

type edquota <username> as root on thuban.

this drops you into a vi editing session.

change the quota as desired and then exit the session.

If you are not familiar with vi, you may find more details on how to edit this file under the “new user” post, where the setting up of a quota for a new user is discussed.
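
As far as I know, the block limits that edquota shows on Linux are counted in 1 KB blocks, so a limit you think of in gigabytes has to be converted before you type it in. A tiny sketch of the arithmetic (gb_to_quota_blocks is a made-up name, and the 1 KB block size is the assumption to double-check on thuban):

```shell
#!/bin/sh
# Sketch: convert a limit in gigabytes to a count of 1 KB quota blocks.
# Assumes the quota tools count blocks of 1024 bytes.
gb_to_quota_blocks() {
    echo "$(( $1 * 1024 * 1024 ))"
}
```

For example,   gb_to_quota_blocks 5   gives the number to enter for a 5 GB limit.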

–Joel

How to restore files from old 8mm Exabyte tapes

8mm Exabyte backup tapes written between 1994 and 2003 were created with the ufsdump command on a Solaris box. We no longer have the Solaris box, but the Exabyte tape drives are now hooked up to algol (running linux) and can be used to recover the files using restore, the linux port of ufsrestore.

Unlike ufsrestore, restore requires you to specify the blocking factor used to write the files to tape. And, since the dump scripts wrote multiple file partitions to each tape, you also need to tell restore which tape file the partition was written to on the tape. Both the blocking factor and tape file index can be determined by inspecting the old dump logs stored on algol under /docs/thuban-docs/dumps/

Example:

Joel hands you an Exabyte tape with the label “2/9/95” on it, and asks you to recover a file from the “/usr3” partition that was written to that tape.

1. Login to algol as root

2. cd /docs/thuban-docs/dumps/1995/

3. View the dump log for that date (e.g. ./02.09.95.dmp.gz) using emacs or less (which run gunzip on the fly), and look for ‘/usr3’.

4. Determine the blocking factor used for that tape. Look for a line like: “DUMP: Writing 32 Kilobyte records“, and remember that value (’32 Kilobyte’ in this case). Note that historically the blocking factor changed over time, so always check the blocking factor used for the particular dump in question.

BLOCKING FACTOR = 32KB

UPDATE 2015-06-17: Older tapes (for which there are no dump logs) often used a blocking factor of 10KB.

UPDATE 2015-06-17: If you don’t know the block size, use the commands “mt -f /dev/st0 setblk 0; mt -f /dev/st0 setdensity 0x0” to set the drive to “variable mode”. This gives you lower performance (only 1 block is transferred for each SCSI command, plus Check Condition overhead). If you do this, don’t specify the -b <blocksize> argument when using the restore command.

5. Now determine the index of the tape file holding the partition you want to recover. Starting at the top of the dump log file and counting from ‘1’, count the “DUMP: DUMP IS DONE” lines until you get to the “DUMP: DUMP IS DONE” line for your partition.

In this dump log, partition ‘/usr3’ was written as the 16th tape file to the tape, so:

TAPE FILE INDEX = 16

6. Insert the tape in the Exabyte tape drive

7. Rewind the tape. Currently the Exabyte tape drive is /dev/st0, so issue this command:

mt -f /dev/st0 rewind

8. Tell linux what the tape density is using ‘mt‘. Linux seems to autodetect this, but just to be safe… :

mt -f /dev/st0 setdensity 0x15

If you don’t know the tape density, you can set the density to 0 instead:

mt -f /dev/st0 setdensity 0

9. Tell linux which blocksize to use with ‘mt’. In this case our BLOCKING FACTOR is 32kb, which the ‘mt’ command likes to get as an actual byte count. So we multiply 32 x 1024 to get 32768.

mt -f /dev/st0 setblk 32768

If you don’t know the blocksize, set it to 0 (and don’t specify the -b <blocksize> argument to restore):

mt -f /dev/st0 setblk 0

10. cd to the directory you want to put the recovered files in.

cd /tmp/recovery/1995-02-09

11. Call the ‘restore’ command with the following syntax:

restore -a -i -v -b <blocksize> -s <tape file index> -f /dev/st<tape drive ID>

The above arguments tell ‘restore‘ that we want to recover files in interactive mode (-i), ignore volumes (-a), and run in verbose mode (-v).

Given our BLOCKING FACTOR of 32kb and TAPE FILE INDEX of 16, we issue this command:

restore -a -i -v -b 32 -s 16 -f /dev/st0

‘restore‘ should now forward the tape to the 16th tape file and then enter interactive mode, which is somewhat ftp-ish.

Once in interactive mode you can use ‘cd‘ and ‘ls‘ to explore the partition. You can specify the files to extract by issuing the ‘add <fname>‘ command. When <fname> is a directory, ‘add‘ recursively targets the directory and its contents. The ‘ls‘ command prefixes the names of files ‘add‘ed to the extraction list with an ‘*’.

For a full list of commands use the ‘help‘ command.

Once you’ve told ‘restore‘ which files to recover to your local directory with ‘add‘ commands, issue the ‘extract‘ command to initiate the recovery. This can take several minutes, up to an hour in some cases, so go do something else and come back later.

When ‘restore‘ has recovered the files it will ask you:

set owner/mode for '.'? [yn]

Answer ‘n‘.

Here’s a sample interactive recovery session. In this case we restore the directory ‘./applications/p1391/’ from the ‘/usr3’ partition saved to the ‘2/9/95’ dump tape:

[root@algol /tmp/recovery/1995-02-09]$ restore -a -i -v -b 32 -s 16 -f /dev/st0
restore > ls
.:
11.29.dmp applications/ old_lang/ spool/
aipsnewest/ lang/ software/ tmp/

restore > cd applications/
restore > ls
./applications:
book/ games/ hiabs/ psrcat/ tempo/
ftptool/ graphing/ p1391/ starlink/ timing/


restore > add p1391
restore > extract

… [WAIT UP TO AN HOUR] …

set owner/mode for '.'? [yn] n
restore > quit
[root@algol /tmp/recovery/1995-02-09]$
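
The two bookkeeping steps in the procedure above (counting “DUMP: DUMP IS DONE” lines in step 5, and turning the blocking factor into a byte count in step 9) are easy to get wrong by eye, so here is a sketch of both. The function names are made up. The counting helper assumes that the partition name appears in the log before that partition’s own “DUMP IS DONE” line, and it matches the name as a plain substring, so check the answer by hand if one partition name is a prefix of another.

```shell
#!/bin/sh
# Sketch: print the tape file index for a partition by counting
# "DUMP: DUMP IS DONE" lines in a dump log read from stdin.
# Assumes the partition name shows up in the log before its own DONE line.
tape_file_index() {
    awk -v part="$1" '
        # First mention of the partition: its dump is the next one to finish.
        !found && index($0, part) { found = 1; idx = done + 1 }
        /DUMP: DUMP IS DONE/      { done++ }
        END { if (found) print idx; else exit 1 }
    '
}

# Sketch: convert a blocking factor in kilobytes to the byte count mt wants.
kb_to_bytes() {
    echo "$(( $1 * 1024 ))"
}
```

With the example tape this would be used as

gzip -dc 02.09.95.dmp.gz | tape_file_index /usr3
mt -f /dev/st0 setblk "$(kb_to_bytes 32)"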

Kudos to James Fuller for figuring this out.

Algol can now talk to its SDLT Tape Drive

Algol has never been able to properly communicate with the SDLT tape drive. The symptoms were that you could write/read a single small file to/from the tape drive, but an attempt to write a *directory* to the drive would fail with this message in the shell:

tar: /dev/st1: Cannot write: Input/output error
tar: Error is not recoverable: exiting now

…and this message in /var/log/messages:

Jan 9 15:20:08 algol kernel: st1: Error with sense data: Current st1: sense key Aborted Command
Jan 9 15:20:08 algol kernel: Additional sense: Synchronous data transfer error

The first thing that James Fuller and I did was to see if the other tape drive, a Sun 8mm exabyte drive known to algol as /dev/st0, was working. It worked fine. That indicated that algol’s Adaptec 39160 scsi controller card was at least partly working.

The next thing we did was reboot algol. This wasn’t really planned as part of our attempt to get the tape drive working, but rather to test the newly installed idl7.0 license manager. But it was a good thing we did. During the boot sequence we saw a screen titled “Adaptec SCSI Select: hit Ctrl-A to enter”. Prior to this time we had no idea such a utility existed. It’s basically a BIOS for the scsi controller card.

We hit ctrl-a, entered the utility and poked around. The utility enables you to see, for each of the two “channels” A & B (ports), the devices mapped to device ID’s 0-15. On channel A, only the scsi controller itself is mapped (to device ID 7). On channel B, the exabyte tape drive was mapped to device ID 0, the Quantum SDLT was mapped to device ID 6, and the scsi controller to device ID 7.

Each channel screen showed a matrix of the devices and parms for each device. The first such parm was “Initiate Wide Negotiation”. This rang a bell in James’ head — he remembered reading somewhere that this parm, when turned on, could cause problems for some scsi devices. So we turned it off for device 6. Note: making this change had the side effect of changing the communications speed value from 160Mb/s to 40Mb/s.

However, when we exited the utility and allowed the reboot to complete, we could write to/read from the SDLT tape drive.

I verified that it was really working by writing /home to it, reading it back to a temporary location, and then doing a recursive diff (diff -rq /home /root/foodir/home) against the original and the copy read from the tape. There were no diffs, and /home takes up 1.5G so I feel confident that Algol can really talk to the SDLT tape drive now.
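
The write, read back, and compare check described above can be sketched as one function. The error messages earlier in this post show tar writing to /dev/st1, but the exact tar options I used are not recorded, so the ones below are a plausible reconstruction rather than a transcript. verify_roundtrip is a made-up name, and the archive target can be a tape device or just a plain file for a dry run.

```shell
#!/bin/sh
# Sketch: write a directory to an archive target with tar, read it back into a
# scratch directory, and compare the two trees. verify_roundtrip is a
# hypothetical helper; the target can be a tape device or a plain file.
verify_roundtrip() {
    src="$1"        # directory to write, e.g. /home
    archive="$2"    # e.g. /dev/st1
    scratch="$3"    # empty directory to read the copy back into
    parent=$(dirname "$src")
    name=$(basename "$src")
    mkdir -p "$scratch" &&
        tar -cf "$archive" -C "$parent" "$name" &&
        tar -xf "$archive" -C "$scratch" &&
        diff -rq "$src" "$scratch/$name"
}
```

Something like   verify_roundtrip /home /dev/st1 /root/foodir   would mirror the test above; no output and a zero exit status means the copy read back from tape matches the original.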