Attaching Taxonomy Terms to Existing Drupal Nodes with SQL

A website I work with has a whole bunch of Taxonomy “keywords” that needed to be assigned to existing content. There’s no way I could use third party modules to assign these terms to the existing content because I’m using large data sets (~500,000). Plus, why have the overhead when I just use SQL to directly access manipulate the data.

Below is a SQL statement that will add a taxonomy association to any content that contains the taxonomy keyword:

INSERT INTO term_node
(nid, vid, tid)
SELECT q.nid, q.nid, t.tid
FROM content_type_yourtype q
JOIN term_data t
-- Make sure you only find full words, not partial matches
ON q.field_type_value LIKE CONCAT('% ',, ' %')
-- Don't try to add existing entries
WHERE NOT EXISTS (SELECT n.nid FROM term_node n WHERE n.nid = q.nid AND n.vid = q.nid AND n.tid = t.tid)
-- Clear out blanks
AND != ''
-- This is optional, I need it because my hosting provider times out on 3+ min queries...
AND q.nid >= 200000 AND q.nid < 300000 )

SCRATCH THIS. You need to setup a job to run this, it's too much work on the database:

PHP Script to use:

= $min AND q.nid < $max ");

Bash Script to automate:

for i in {1..150};
do echo "Running Job $i with $j";
php -e tax_update.php $min `expr $min + $limit`;
min=`expr $min + $limit`;

Manually Fixing / Adding Permissions to node_access Table (Drupal 6)

For some reason one of the Drupal sites that I work on for one of my clients wasn’t adding new information to the node_access table for new content being created. We noticed this because some of the results weren’t showing up in searches. (They won’t if the node doesn’t have grant_view checked.) Anyhow, I created the following query to fix this:

* This statement is just used to check how many need to be
* sync'd and can be ignored. It does let you know how many
* nodes your missing though 😉
SELECT count(*) FROM mydatabase.node n where n.nid not in (
select a.nid from mydatabase.node_access a)
and (n.type = 'content_type_a' or n.type = 'content_type_b');

* Create a temp table to insert all missing permissions.
* Note, this isn't required if you only have one access type,
* but since I three, it is necessary. This tables structure
* is just a replica of node_access table.
create temporary table s (
nid int not null,
gid int,
realm varchar(255),
grant_view tinyint(3),
grant_update tinyint(3),
grant_delete tinyint(3)

* I have three node permission groups that I have to change,
* so I will run three insert statements to populate my temp
* table. There is probably a way to combine this into one
* SQL query, but that's another days lesson.
insert into s
/* This group gets view access */
select n.nid, GID_VAL, 'term_access', 1, 0, 0
from mydatabase.node n
where n.nid not in (
select a.nid from mydatabase.node_access a
and (n.type = 'content_type_a' or n.type = 'content_type_b');

insert into s
/* This group gets view access */
select n.nid, GID_VAL, 'term_access', 1, 0, 0

insert into s
/* This group gets view, update and delete access */
select n.nid, GID_VAL, 'term_access', 1, 1, 1

* Finally add all of the missing node permissions into
* the real node_access table!
insert into mydatabase.node_access
(nid, gid, realm, grant_view, grant_update, grant_delete)
select s.nid, s.gid, s.realm, s.grant_view, s.grant_update, s.grant_delete
from s;

* You might wonder why you need the temp table, it's cause the
* where clause would return no rows after that first INSERT.
* If you only have one GID to worry about, you can forget
* the temp table step.

* Confirm you get 0 results. (No missing nodes)
SELECT count(*) FROM mydatabase.node n where n.nid not in (
select a.nid from mydatabase.node_access a)
and (n.type = 'content_type_a' or n.type = 'content_type_b');

* Remove temp.
drop table s;

Yes, you could use just rebuild all of the permissions through Drupal, but when the site has an extremely large node population, this takes hours.

Parsing Full Names into First and Last Columns in MySQL

So the other day I was importing a list of authors into one of my clients databases and I screwed up and put the entire name into the first name column. (Well really because that’s how the file I was parsing was created.) Anyway, my client wanted them separated, which made sense since the other 8000 authors were done this way. To do this I wrote the following query and ran it:

update db.content_type_auth a
# Update the last name first so you don't lose the entire name. You're just taking everything after the first word and setting that as the last name. (+2 to because of ' ')
set a.lname = substring(a.fname, length(substring_index(a.fname, ' ', 1))+2, length(a.fname)),
# Update the first name using the first word
a.fname = substring_index(a.fname, ' ', 1)

# This was needed for my specific case.
where a.nid in
SELECT nid FROM db.node where type = 'auth' and nid > 555555 and nid < 666666 )

The only side effect from the above query was that if the name had any leading white space, the first name became blank and the last name became the whole one. To fix that just add trim() to

# Last name
set a.lname = substring(trim(a.fname), length(substring_index(trim(a.fname), ' ', 1))+2, length(a.fname)),

# First name
a.fname = substring_index(trim(a.fname), ' ', 1)

I also had about 5 that were co-presenters, but that's an entirely different situation.

How To Set Up Automatic Backups of your Godaddy Website and Databases

I had a client that recently wanted to start backing up all of the data from his websites to an external hard drive. I was like, “Ok, that’s easy. GoDaddy creates daily backups of your files that you can zip up from your account and download.” Wrong.

Godaddy does create daily backs, which you can zip up and download, but not if your trying anything over 10 MB or so. Also, it DOES NOT create backups of your databases. But that’s ok, cron jobs to the rescue.

To create the necessary backups I needed to do the following:

1) Write a script that would generate backups of all the databases and store them in a private zone
2) Write a script that would create a zip file of the entire site
3) Add both of these as cron jobs

1) backupmydatabases


// Just create a sql dump of the database.
// Do this for each database that you need to backup

2) backupmyfiles


// Remove the previous backup file.
rm -f ~/private/access/fullsite_backup.tar.gz

// Create a tarball file of everything in your pocket of the server
// Dont include the actual tarball!!
tar --exclude=~/private/access/fullsite_backup.tar.gz -czf ~/private/access/fullsite_backup.tar.gz ~/*

3) Godaddy Crons

Now you just go to your Godaddy account and setup 2 cron jobs to run each of those scripts. Make sure you run the database backup script first and allow enough time for it to finish before you start the file backup script!

All you have to do now to get a backup of your entire webserver is just download the fullsite_backup.tar.gz file.

Using SQL to Update Content In One Table With Content From Another

This isn’t something new, but I found myself doing this again the other day so I figured I’d “archive” it. (Though most of this stuff will be out of date in 10 years anyway :-P)

So I have two tables; node and content_type_author (yeah we’re talking Drupal). I wanted to update the ‘title’ column of all the rows in node that were type ‘author’ with the authors first name and last name (found in the content_type_author table):

UPDATE node N, content_type_author A
SET N.title = CONCAT(IFNULL(A.field_fname_value, ''), ' ', IFNULL(A.field_lname.value, ''))
WHERE N.nid = A.nid AND N.type = 'author'

That’s it. The only really interesting part is:

CONCAT(IFNULL(A.field_fname_value, ''), ' ', IFNULL(A.field_lname.value, ''))

which says to combine the first and last name columns to create a name string, but if either the first name or last name is not there, just replace it with an empty string. (In MySQL if you try to concatenate a string with NULL you get NULL)

CONCAT(NULL, 'Strickland') => NULL
CONCAT('Brian', IFULL(NULL, '')) => 'Brian'