Thursday, October 04, 2007

Subquery Problems in mySQL

I'm having trouble selecting data for a web-based application. There are three tables, Articles, Keywords and KeywordsInArticle. There is a many-to-many relationship between the articles and keywords, which is handled by the piggy-in-the-middle KeywordsInArticle table. So, if we have the following scenario:

Articles
articleID  articleName
1             First Article
2             Second Article
3             Third Article
4             Fourth Article

Keywords
keywdID  keyWord
1             apples
2             oranges
3             bananas
4             grapes

KeywordsInArticle
articleID  keywdID
1             1
1             3
2             1
2             2 etc

We know that "First Article" is all about apples and bananas while "Second Article" is about apples and oranges etc. And for article #1, I can pull out a list of the relevent keywords by using the following SQL statement:

SELECT keywordsinarticle.articleID, keywordsinarticle.keywdID, keywords.keyWord
FROM keywords, keywordsinarticle
WHERE ((keywordsinarticle.keywdID=keywords.keywdID) AND (keywordsinarticle.articleID=1))
This would give the following result:
articleID   keywdID   keyWord
1             1              apples
1             3              bananas
So far, straighthforward. However, in the Content Management side of the application, there needs to be a way of managing the keywords associated with each article, to update the KeywordsInArticle table. However, this has a compound key value made up from the articleID and the keywdID. It cannot contain duplicate rows, so when adding a keyword entry to the list, I want to be able to select all keywords not already associated with that article. In the example, I need to select only "oranges" and "grapes" from the keywords table, to give them as options to add.

I want to be able to invert my selection, so I tried a using the above as a subquery and negating the result:
SELECT *
FROM keywords
WHERE keywdID NOT IN (SELECT keywordsinarticle.articleID, keywordsinarticle.keywdID, keywords.keyWord
FROM keywords, keywordsinarticle
WHERE ((keywordsinarticle.keywdID=keywords.keywdID) AND (keywordsinarticle.articleID=1)))
Sadly, mySQL throws a wobbler at this, telling me:
Message = "Operand should contain 1 column(s)"
Looking up this error at the mySQL website directed me towards the Row Subqueries page, but I'm not convinced this fits my problem either. So now I'm rather stuck. If anyone has any ideas, I'd love to hear them!

3 comments:

Anonymous said...

I cant test this right now but try:

select * from keywords kw where kw.keywdid NOT IN
(Select ka.keywdid from keywordsinarticle ka left join keywords k on ka.keywdID = k.keywordID where ka.articleID = 1)

matt

Caz Mockett said...

Thanks Matt, that sorted the problem nicely - I owe you a pint at the next Pubstandards ;-)

Anonymous said...

its not often we find people who understand these issues and feel as passionately about them as you. Thank you. From

your friends at Unix Commerce Web Design