05-20-2016 08:59 AM - edited 03-01-2019 06:41 AM
Task Name | Delete SR's via date or SR ID |
Description | |
Prerequisites | Tested on 5.4 |
Category | Workflow |
Components | vSphere 5.x |
User Inputs |
Instructions for Regular Workflow Use:
A thank you goes out to Tejeswar Sahu!
From time to time it can happen the a Single SR record can accumulate a lot of DB table entries. This is usually the result of an errant java script loop in a custom task (Like in my case). This manifests that the 100 GB UCSD appliance disk is starting to fill up and at some point a DB backup is not possible any more. The problem of deleting many DB rows is that till a DB commit happens the DB keep a temp roll back section of the items to be deleted. Since the disk is already full this will then fail.
I have several examples in this write up on how to handle this situation:
1) Truncate
This the most radical case where as in my case I do not care about older SR records and want to roll nothing back from previous SR records. This is also very fast!
2) Limit
Delete via script a few 100,000 records at a time and force a DB commit. Takes some time but you can delete just a single SR record with its millions of associated rows.
3) Workflow Task (attached)
The workflow task is a pre cursor to a GUI button that will come in UCSD version 5.5 or later. This workflow task is good for the deletion of a SR that does not suffer from millions of rows and possibly can not succeed due to UCSD appliance disk space issues.
The idea is to assess in your environment what is best to remedy the situation. In a production system one may not want to truncate due to the fact you can not roll any workflow back anymore and hence want to use the more surgical approach and remove the SR in trouble. As for everything it is highly recommended to clone a system and determine the best procedure for the environment.
Limit with script:
The Zip File Top Service Request Log has the following content in aiding to find the larges SR ID with the most table rows:
TopServiceRequestLog.sh :
Identify service requests having large number of SR logs.
DeleteServiceRequestLog.sh :
Delete the SR logs for the given SR. Number of records deleted and committed can be controlled by setting the ‘limit’ variable value in the script. By default the script deletes 100K records at time, commit it, sleep 1 seconds between each delete operation.
The run of these two scripts looks like this:
In my case I ended up with a single SR ID that had 400 Million rows in the DB (Yes bad java script coding on my part!):
[root@localhost tmp]# ./TopServiceRequestLog.sh
Querying db for number of records in SERVICE_REQUEST_LOG table
........................................................................................................................................................... done
Querying db for top 20 SR LOG
........................................................................................................................................ done
DONE
LOG_FILE=/tmp/TopServiceRequestLog.log
[root@localhost tmp]# cat /tmp/TopServiceRequestLog.log
[Fri May 20 09:45:52 CDT 2016] Querying db for number of records in SERVICE_REQUEST_LOG table
count(*)
441531721
[Fri May 20 09:51:03 CDT 2016] Querying db for top 20 SR LOG
count(*) SRID
441219679 5959
7408 6066
2576 6350
2576 6351
2574 6349
1199 6581
1199 6613
1199 6645
1199 6677
1199 6592
1199 6624
1199 6656
1199 6688
1199 6603
1199 6635
1199 6667
1199 6582
1199 6614
1199 6646
1199 6678
[Fri May 20 09:55:35 CDT 2016] DONE
[root@localhost tmp]# chmod +x DeleteServiceRequestLog.sh
[root@localhost tmp]# ./DeleteServiceRequestLog.sh
Enter SR ID: 5959
Enter number of SR logs: 441219679
Deleting service request log from SERVICE_REQUEST_LOG table for SR
............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................
The workflow tasks do not limit the delete and can fill up the file system. You will have to determine the right procedure in your situation. In my case I had very little DB space left and had to use the script version.
Here is the script to find the largest consumers:
#!/bin/bash
topN=20
LOG_FILE=/tmp/TopServiceRequestLog.log
#rm -rf $LOG_FILE
log_msg () {
printf "$1"
printf "[`date`] $1" >> $LOG_FILE
}
printProgressPID() {
PID=$1
while kill -0 $PID 2>/dev/null; do
printf "."
sleep 2
done
printf " done\n"
}
log_msg "Querying db for number of records in SERVICE_REQUEST_LOG table\n"
nohup mysql -u root -pcloupia -A db_private_admin -e "select count(*) from SERVICE_REQUEST_LOG" >> $LOG_FILE 2>&1 &
PID=$!
printProgressPID $PID
log_msg "Querying db for top $topN SR LOG\n"
nohup mysql -u root -pcloupia -A db_private_admin -e "select count(*), SRID from SERVICE_REQUEST_LOG group by SRID order by count(*) desc limit $topN" >> $LOG_FILE 2>&1 &
PID=$!
printProgressPID $PID
log_msg "DONE\n"
echo "LOG_FILE=$LOG_FILE"
Here is the delete script to delete the largest consumer:
#!/bin/bash
printf "Enter SR ID: "
read SRID
printf "Enter number of SR logs: "
read NumOfSrLogs
limit=100000
printf "Deleting service request log from SERVICE_REQUEST_LOG table for SR $SR_ID\n"
toBeDeleted=$NumOfSrLogs
while [ $toBeDeleted -gt 0 ]; do
mysql -u root -pcloupia -A db_private_admin -e "DELETE FROM SERVICE_REQUEST_LOG WHERE SRID=$SRID ORDER BY SERVICE_REQUEST_LOG_ID limit $limit"
sleep 1
printf "."
let toBeDeleted-=$limit
done
printf " done\n"
Here are other tables that will also have to be taken into consideration in the delete process (above scrip needs to mod'ed):
DELETE FROM db_private_admin.SERVICE_REQUEST where requestid = 1;
DELETE FROM db_private_admin.SERVICE_REQUEST_ASSET_CHANGE where requestid = 1;
DELETE FROM db_private_admin.WF_INPUT_VALUE where inputsetid = 1;
DELETE FROM db_private_admin.SERVICE_REQUEST_LOG where srId = 1;
DELETE FROM db_private_admin.WORKFLOW_DETAILS where requestId = 1;
DELETE FROM db_private_admin.WORKFLOWSTEP_DETAILS where requestid = 1;
DELETE FROM db_private_admin.EXECUTEDSERVICEREQUESTSNAPSHOT where srId = 1;
This should all be done on a cloned version of your UCSD install!
Truncate by hand
Checking problem Table
=====================
ls -ltrh /var/lib/mysql/data/db_private_admin/SERVICE_REQUEST_LOG.ibd
56G /var/lib/mysql/data/db_private_admin/SERVICE_REQUEST_LOG.ibd
Stopping UCSD application
=========================
/opt/infra/stopInfraAll.sh
Logging into the DB
===================
mysql -u admin --password=cloupia db_private_admin -A
Looking at the table in question
==================================
mysql> describe SERVICE_REQUEST_LOG;
+------------------------+---------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+------------------------+---------------+------+-----+---------+-------+
| SERVICE_REQUEST_LOG_ID | bigint(20) | NO | PRI | NULL | |
| MESSAGE | varchar(4096) | YES | | NULL | |
| SEVERITY | int(11) | NO | | NULL | |
| SRID | int(11) | NO | MUL | NULL | |
| TIMESTAMP | bigint(20) | NO | | NULL | |
+------------------------+---------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
truncate table SERVICE_REQUEST_LOG (see below for limit statement);
===================================
mysql> truncate table SERVICE_REQUEST_LOG;
Query OK, 0 rows affected (2 min 52.45 sec)
===========================================
mysql> select count(*) from SERVICE_REQUEST_LOG;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
1 row in set (0.00 sec)
truncate table SERVICEREQUEST_PROVPARAMLIST;
==============================================
truncate table SERVICEREQUEST_PROVPARAMLIST;
exit
If the truncate is to radical then the following procedure can be used:
Single limit delete:
mysql -u admin --password=cloupia db_private_admin -A -e "DELETE FROM db_private_admin.SERVICE_REQUEST_LOG where srId = 5959 LIMIT 1000000"
Limit delete with loop (10x100000):
for i in `seq 1 10`; do
mysql -u admin --password=cloupia db_private_admin -A -e "DELETE FROM db_private_admin.SERVICE_REQUEST_LOG where srId = 5959 LIMIT 1000000"
done
Limit delete with loop(804x100000):
for i in `seq 1 804`; do
mysql -u admin --password=cloupia db_private_admin -A -e "DELETE FROM db_private_admin.SERVICE_REQUEST_LOG where srId = 5959 LIMIT 1000000"
done
In order to make the DB faster after the mass delete:
Optimize DB
=============
cd /tmp/DbOptimize
sh RunDbOptimize.sh
Checking Table Extents
=====================
cd /var/lib/mysql/data/db_private_admin
filefrag *.ibd | awk '{print $1$2}' | sed 's/.ibd:/,/' | sort -nr --field-separator=',' -k2 | awk -F, '{print $2}' | awk '{total = total + $1}END{print total}'
11361
DB optimize
===========
mysqlcheck -u root -p --auto-repair --optimize --all-databases
Checking Table Extents again
==========================
cd /var/lib/mysql/data/db_private_admin
filefrag *.ibd | awk '{print $1$2}' | sed 's/.ibd:/,/' | sort -nr --field-separator=',' -k2 | awk -F, '{print $2}' | awk '{total = total + $1}END{print total}'
11439
Checking Problem Tabel
=====================
ls -ltrh /var/lib/mysql/data/db_private_admin/SERVICE_REQUEST_LOG.ibd
112K Feb 8 13:50 /var/lib/mysql/data/db_private_admin/SERVICE_REQUEST_LOG.ibd
Checking disk space on UCSD system
===============================
df -h
Filesystem Size Used Avail Use% Mounted on
/dev/sda3 97G 21G 71G 23% /
/dev/sda1 194M 35M 149M 19% /boot
tmpfs 5.9G 0 5.9G 0% /dev/shm
71 GB OPEN !!!
Start the UCSD application
============================
/opt/infra/startInfraAll.sh
Orf,
This is awesome! It happens from time to time that a single SR adds millions of records to the SR Log History table, and we need to take manual steps to clear those records. Those shell scripts should make our lives much easier.
I had a couple of thoughts/questions to make this article more clear to me...
Your article above has two sections:
Two requests:
Thanks!
Eric
I added more info. Take a look. thx
Find answers to your questions by entering keywords or phrases in the Search bar above. New here? Use these resources to familiarize yourself with the community: