Attaching Taxonomy Terms to Existing Drupal Nodes with SQL

A website I work with has a whole bunch of Taxonomy “keywords” that needed to be assigned to existing content. There’s no way I could use third party modules to assign these terms to the existing content because I’m using large data sets (~500,000). Plus, why have the overhead when I just use SQL to directly access manipulate the data.

Below is a SQL statement that will add a taxonomy association to any content that contains the taxonomy keyword:


INSERT INTO term_node
(nid, vid, tid)
(
SELECT q.nid, q.nid, t.tid
FROM content_type_yourtype q
JOIN term_data t
-- Make sure you only find full words, not partial matches
ON q.field_type_value LIKE CONCAT('% ', t.name, ' %')
-- Don't try to add existing entries
WHERE NOT EXISTS (SELECT n.nid FROM term_node n WHERE n.nid = q.nid AND n.vid = q.nid AND n.tid = t.tid)
-- Clear out blanks
AND t.name != ''
-- This is optional, I need it because my hosting provider times out on 3+ min queries...
AND q.nid >= 200000 AND q.nid < 300000 )

SCRATCH THIS. You need to setup a job to run this, it's too much work on the database:

PHP Script to use:


= $min AND q.nid < $max ");

Bash Script to automate:


min=202500;
limit=2000;
for i in {1..150};
do echo "Running Job $i with $j";
php -e tax_update.php $min `expr $min + $limit`;
min=`expr $min + $limit`;
done

Manually Fixing / Adding Permissions to node_access Table (Drupal 6)

For some reason one of the Drupal sites that I work on for one of my clients wasn’t adding new information to the node_access table for new content being created. We noticed this because some of the results weren’t showing up in searches. (They won’t if the node doesn’t have grant_view checked.) Anyhow, I created the following query to fix this:


/*
* This statement is just used to check how many need to be
* sync'd and can be ignored. It does let you know how many
* nodes your missing though 😉
*/
SELECT count(*) FROM mydatabase.node n where n.nid not in (
select a.nid from mydatabase.node_access a)
and (n.type = 'content_type_a' or n.type = 'content_type_b');

/*
* Create a temp table to insert all missing permissions.
* Note, this isn't required if you only have one access type,
* but since I three, it is necessary. This tables structure
* is just a replica of node_access table.
*/
create temporary table s (
nid int not null,
gid int,
realm varchar(255),
grant_view tinyint(3),
grant_update tinyint(3),
grant_delete tinyint(3)
);

/*
* I have three node permission groups that I have to change,
* so I will run three insert statements to populate my temp
* table. There is probably a way to combine this into one
* SQL query, but that's another days lesson.
*/
insert into s
/* This group gets view access */
select n.nid, GID_VAL, 'term_access', 1, 0, 0
from mydatabase.node n
where n.nid not in (
select a.nid from mydatabase.node_access a
)
and (n.type = 'content_type_a' or n.type = 'content_type_b');

insert into s
/* This group gets view access */
select n.nid, GID_VAL, 'term_access', 1, 0, 0
...

insert into s
/* This group gets view, update and delete access */
select n.nid, GID_VAL, 'term_access', 1, 1, 1
...

/*
* Finally add all of the missing node permissions into
* the real node_access table!
*/
insert into mydatabase.node_access
(nid, gid, realm, grant_view, grant_update, grant_delete)
select s.nid, s.gid, s.realm, s.grant_view, s.grant_update, s.grant_delete
from s;

/*
* You might wonder why you need the temp table, it's cause the
* where clause would return no rows after that first INSERT.
* If you only have one GID to worry about, you can forget
* the temp table step.
*/

/*
* Confirm you get 0 results. (No missing nodes)
*/
SELECT count(*) FROM mydatabase.node n where n.nid not in (
select a.nid from mydatabase.node_access a)
and (n.type = 'content_type_a' or n.type = 'content_type_b');

/*
* Remove temp.
*/
drop table s;

Yes, you could use just rebuild all of the permissions through Drupal, but when the site has an extremely large node population, this takes hours.

Parsing Full Names into First and Last Columns in MySQL

So the other day I was importing a list of authors into one of my clients databases and I screwed up and put the entire name into the first name column. (Well really because that’s how the file I was parsing was created.) Anyway, my client wanted them separated, which made sense since the other 8000 authors were done this way. To do this I wrote the following query and ran it:


update db.content_type_auth a
# Update the last name first so you don't lose the entire name. You're just taking everything after the first word and setting that as the last name. (+2 to because of ' ')
set a.lname = substring(a.fname, length(substring_index(a.fname, ' ', 1))+2, length(a.fname)),
# Update the first name using the first word
a.fname = substring_index(a.fname, ' ', 1)

# This was needed for my specific case.
where a.nid in
(
SELECT nid FROM db.node where type = 'auth' and nid > 555555 and nid < 666666 )

The only side effect from the above query was that if the name had any leading white space, the first name became blank and the last name became the whole one. To fix that just add trim() to


# Last name
set a.lname = substring(trim(a.fname), length(substring_index(trim(a.fname), ' ', 1))+2, length(a.fname)),


# First name
a.fname = substring_index(trim(a.fname), ' ', 1)

I also had about 5 that were co-presenters, but that's an entirely different situation.