PylonsHQ.

Layout: Fixed-width

MySQL fulltext search in SQLAlchemy with safe query parameters

Unknown macro: {metadata-list}
Name MySQL fulltext search in SQLAlchemy with safe query parameters
Space Pylons CookBook
Section  
Page MySQL fulltext search in SQLAlchemy with safe query parameters
Version 1.0
Status Draft
Curator Graham Higgins
Reviewed False
Author(s)  

MySQL fulltext search in SQLAlchemy with safe query parameters

I needed to be able to query a table with exact matches for some fields but fuzzy fulltext search on another. This allowed me to search on messages which had words my query did not, and found hits when my query had extra words. Standalone testing showed it worked very well. Getting it working with query parameters which would be immune to SQL Injection attacks took me a while to figure out.

Creating the FULLTEXT Index in websetup.py

I define my tables to SQLAlchemy in models/_init_.py and the ORM classes in models/domain.py.

In websetup.py I use raw SQL to add the fulltext index after the tables are created:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
from sqlalchemy import *
from myapp.models import *
    
def setup_config(command, filename, section, vars):
    app_conf = appconfig('config:' + filename)
    if not app_conf.has_key('sqlalchemy.dburi'):
        raise KeyError("No sqlalchemy database config found!")
    conn = metadata.connect(app_conf\['sqlalchemy.dburi'])
    metadata.engine.echo = asbool(app_conf.get("sqlalchemy.echo", False))  
    metadata.drop_all()
    metadata.create_all()
    
    metadata.engine.execute("ALTER TABLE kb ADD FULLTEXT(message)")
    
    session = create_session()

The MySQL docs point out that if you only have a couple entries then your FULLTEXT queries will fail because it omits data that occurs in 50% or more of the entries, just like "stop words". It also says that it's much faster to load the table with the index off then create the index rather than loading with the index turned on.

Run query with bound parameters in the controller

In my controller kb.py, I could use the same style of raw SQL as above but I'd be passing in naked user query strings which can break if they have quotes in them, leaving the DB open to SQL injection attacks. Instead, create a query with bound query parameters using the technique outlined in the SQLAlchemy manual section "Using Bind Parameters in Text Blocks". The config/routing.py file matches URLs like ".../kb/:fac/:sev/*text" and I found I had to use the asterisk-text or text with some odd characters didn't match the URL, even when the HTTP query string was URL-encoded.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
from myapp.lib.base import *
from myapp.lib.database import session_context
from myapp.models import metadata
    
class KbController(BaseController):
    
    def __before__(self):
        self.session = session_context.current
    
    def search(self, fac, sev, text):
       t = metadata.engine.text("""
        SELECT ROUND(MATCH(message) 
            AGAINST(:message), 2) 
            AS score,  facility, severity, message,
             explanation, solution, significance,
             notes1, notes2, notes3, os
        FROM kb
        WHERE MATCH(message) AGAINST(:message)
          AND facility=:facility
          AND severity=:severity
        LIMIT :limit
        """)
        c.results = t.execute(message=text, 
                              facility=fac, 
                              severity=sev, 
                              limit=100).fetchall()
        c.fac = fac
        c.sev = sev
        c.text = text
	return render_response('kb_search_results.myt')

The "MATCH...AGAINST" is the interesting part here. As part of the SELECT clause it returns a relevance rating, which I round to a couple decimal places; in the WHERE clause is where the work is done. The query runs surprisingly fast and results are returned from most to least relevant.

Enter labels to add to this page:
Please wait 
Looking for a label? Just start typing.

Powered by Pylons - Contact Administrators