Custom SQL¶
Instead of using the Query Builder tool, you can write queries by hand if you've learned some SQL. The ability to customize your SQL gives you a wider range of queries that you can create, and may also result in more efficient code.
If you write your own SQL query, you may find that testing it out using a direct database connection first is easier for troubleshooting and returns faster results.
SQL¶
Anyone can learn SQL! You may be surprised at how much you can do with just an understanding of the basics. Even if you don't end up writing your own queries, knowing SQL basics may help you understand how your data is stored, what types of information you can access, and how to tweak a query generated by the query builder.
Here are a few SQL references that may help you get started:
- SQL tutorial.
- SQL cheat sheet.
- MySQL manual. We use version 5.6.
- You can use Access to generate a SQL query for you (tips on how).
Note that there are some variations in SQL depending on the type of SQL database used -- ActionKit uses a MySQL database so you need to write the queries in MySQL.
Defining Input Parameters¶
Because you can use Django tags in your query reports, you can create queries that make use of input parameters. This allows you to require user input before a report is run so you don't have to hard code certain values into the report.
To add a parameter to a query report, include something like this:
where ccu.page_id = {page_id}
This tells ActionKit to ask the user to enter the page_id when they run the report. The parameter name, in parentheses, will be shown to the user when prompting them to enter a value and doesn't have to match the required input. In the example above, you could have {id_of_the_page_you_want}. You can give your parameters any names you'd like, as long as they consist only of letters, numbers, and underscore characters; they cannot contain spaces or punctuation.
Do not put quotes around the parameter syntax in your query, or around the values you enter into the form, as these will be added automatically.
If you don't want quotes around your parameter, use this syntax:
ORDER BY ccu.page_id LIMIT {% sql_unquoted limit %}
Be extremely careful with using sql_unquoted
- it allows anyone who can supply a report parameter to potentially inject arbitrary SQL into your reports. For ActionKit admins, this might allow someone to get access to data that their permissions might otherwise prevent them from seeing. If a report is exposed publicly, it could allow an attacker to get away with your entire database. Please tread extremely carefully.
To include the parameter autofill functionality in your own reports, name your parameter page_id
, mailing_id
, or callpage_id
, as appropriate.
Incremental Queries¶
Note
Incremental queries are primarily useful for minidashes in the admin UI.
For reports that return a single numeric value, you can write reports that update faster by only querying for new actions since the query was last run. If your query has code like where core_action.created_at between {last_run} and {now}
, the reporting system will only query for actions since the last cached run of the query, and it'll add the result it gets to the previous cached result. This doesn't work for some reports; for instance, dollars donated goes down when an old donation is reversed, but you'll never see that if your query only looks at the newest actions.
Using a where
clause that checks created_at
on large tables can slow things down when doing a full refresh, so you may want to make your query adapt and only include this clause when it will limit rows. This can be done by using template tags to make part of the SQL conditional based on the internal partial_run
variable, which is false when the report is running a full refresh. For example:
SELECT COUNT(*)
FROM core_action
{% if partial_run %}
WHERE created_at BETWEEN {last_run} AND {now}
{% endif %};
Time Zone Adjustments¶
You can use named time zones in query reports to adjust dates and times from the GMT format ActionKit uses internally to a time zone that makes sense for your organization, while respecting daylight-savings rules.
When writing query reports in SQL, you can use the convert_tz() function. For example, clients in California might want to display timestamps in their reports with convert_tz(core_action.created_at, 'GMT', 'US/Pacific'), while other clients might use 'US/Eastern', 'Europe/Amsterdam', and so forth.
You can select any of the time zone names in the industry-standard "tz" database, most of which are in region/city format, such as 'Canada/Newfoundland', 'Africa/Cairo', and 'Asia/Tokyo'. You can also use country names like "Egypt" or "Japan", although not every country is included, and sometimes the name is not what you might expect, eg Ireland is listed as "Eire". A list of standard time zone names is available here:
Using Comments¶
You can use --
, #
, or /* */
style comments in your query reports.
For comments that come after the semi-colon at the end of a report, only /* */
style comments will work, the other styles will cause an error.
Sample Queries¶
Following are some sample queries. These are not included in your built in reports, but may be useful as reference for writing your own reports.
Retrieving Users Within A Radius¶
You can build a query to find users within a radius from a zip code using {{ users_near_location }}
.
As an example, the following code line will retrieve the IDs of the users within a 20 mile radius of the 60625 zip code without prompting the user for place and radius.
SELECT first_name, last_name, city, type, phone
FROM core_user
JOIN core_phone
on(core_user.id = core_phone.user_id)
WHERE core_user.id={{ users_near_location:place=60625,radius=20 }}
Users Near A Campaign's Events¶
SELECT u.id, e.id, distance
FROM events_event e
JOIN zip_proximity zp on (e.zip = zp.zip)
JOIN core_user u on zp.nearby = u.zip
WHERE campaign_id = 1 and distance < 10;
This query is an example of how you might use the zip_proximity table. The table displays all zip code pairs within 50 mi of each other. There's an index on (zip, distance). The data looks likes this:
zip nearby same_state distance
00501 00501 1 0.0
00501 00544 1 0.2
00501 06401 0 36.5
00501 06403 0 43.0
00501 06404 0 38.9
Count of New Users From A Page With More Than 3 Actions In The Last Month¶
Change the count(*) > 3
to change the number of actions required, or the interval to look at a different time period:
SELECT
COUNT(ca.user_id)
FROM core_action ca
JOIN
(SELECT ca.user_id
FROM core_action ca
WHERE ca.created_at > now() - interval 30 day
group by 1
having COUNT(*) > 3) as a using (user_id)
WHERE ca.page_id={page_id}
AND ca.created_user=1
Displaying Multiple Custom User Field Values¶
This approach works for custom user fields or custom action fields, where you have multiple rows per user:
SELECT
u.first_name,
u.last_name,
employer_uf.value employer,
occupation_uf.value occupation
FROM core_user u
JOIN core_userfield employer_uf
on (employer_uf.name='employer'
and employer_uf.parent_id = u.id)
JOIN core_userfield occupation_uf
on (occupation_uf.name='occupation'
and occupation_uf.parent_id = u.id)
Recurring Profile Donations¶
This query shows the number of payments made toward a particular recurring profile id and the most recent payment. The recurrences field in core_orderrecurring
is currently not in use and does not show the number of payments:
SELECT
core_orderrecurring.id,
core_orderrecurring.status,
COUNT(*),
MAX(core_transaction.updated_at) last_transaction_time
FROM core_orderrecurring
JOIN core_transaction using (order_id)
GROUP by order_id;
Recently Canceled Recurring Donations¶
For PayPal and Braintree, any recurring donations that were canceled through the processor (not through the ActionKit admin), will be recorded in ActionKit daily. The "Cancel date" in this query shows the time when ActionKit recorded the cancellation:
SELECT user_id AS "User ID", order_id as "Order ID", updated_at as "Cancel date", status as "Cancel type"
FROM core_orderrecurring
WHERE updated_at BETWEEN DATE_SUB(NOW(), INTERVAL 7 DAY)
AND NOW()
AND status <> "active";
Product Order Info¶
This query shows the orders placed through a particular page. If you want to use this for fulfillment join to core_order_shipping_address
through core_order
, which has a shipping_address_id
field:
SELECT
co.id AS "Order ID",
co.updated_at AS "Donated on",
co.total AS "Donation Amount",
cpr.name,
cod.quantity,
co.status AS "Status",
cu.email AS "Email",
cu.first_name AS "First Name",
cu.last_name AS "Last Name"
FROM core_action AS ca
JOIN core_order AS co ON (co.action_id = ca.id)
JOIN core_order_detail AS cod ON (cod.order_id = co.id)
JOIN core_product AS cpr ON (cpr.id = cod.product_id)
JOIN core_order_user_detail AS cu ON (co.user_detail_id = cu.id)
WHERE ca.page_id = {page_id}
AND co.status = 'completed'
AND co.total > 0.00,
ORDER BY co.user_id ASC, co.id ASC, co.created_at DESC;
User Donation Summary Box¶
The queries below are used to generate the donation counts and totals on the individual user record.
Single orders:
SELECT
COUNT(DISTINCT core_order.id)
FROM
core_order
LEFT OUTER JOIN core_transaction ON (core_order.id = core_transaction.order_id)
INNER JOIN core_action ON (core_order.action_id = core_action.id)
INNER JOIN core_page ON (core_action.page_id = core_page.id)
LEFT OUTER JOIN core_orderrecurring ON (core_order.id = core_orderrecurring.order_id)
WHERE
(((core_transaction.type = 'sale'
AND core_transaction.status IN ('completed', '')
AND core_transaction.success = 1 )
OR core_page.type = 'Import' )
AND core_order.total > 0
AND core_order.status = 'completed'
AND core_orderrecurring.id IS NULL
AND core_order.user_id = 1 )
Recurring transactions:
SELECT
COUNT(DISTINCT core_transaction.id)
FROM
core_transaction
INNER JOIN core_order ON (core_transaction.order_id = core_order.id)
INNER JOIN core_orderrecurring ON (core_order.id = core_orderrecurring.order_id)
WHERE (core_order.user_id = 1
AND core_transaction.status IN ('completed', '')
AND core_transaction.type = 'sale'
AND core_transaction.success = 1
AND core_orderrecurring.id IS NOT NULL)
Report List By Tag¶
This query will list the id and name of each of your reports with all associated tags, ordered by tag.
SELECT
rr.id, rr.name,
group_concat(rrc.name)
FROM
reports_report rr
JOIN
reports_report_categories map ON (rr.id=map.report_id)
JOIN
reports_reportcategory rrc ON (rrc.id=map.reportcategory_id)
GROUP BY 1
ORDER BY 3
Actions, New Users, And Unsubscribes With Mailing Source In Last Week¶
The following query will return the count of user actions, new users subscribed, and users that unsubscribed within a defined time period. This query defines the time period as 7 days, you can easily change this to any period you wish.
SELECT
m.id,
# m.id in each subquery refers to the core_mailing table in the outer query
(select text
from core_mailingsubject
where mailing_id=m.id limit 1) as subj,
(select count(distinct user_id)
from core_usermailing um
where mailing_id=m.id) as sent,
(select count(distinct user_id)
from core_open
where mailing_id=m.id) as opened,
(select count(distinct user_id)
from core_click
where mailing_id=m.id and link_number is not null) as clicked,
(select count(distinct user_id)
from core_action a left join core_unsubscribeaction ua on action_ptr_id=a.id
where mailing_id=m.id and ua.action_ptr_id is null) as acted,
(select count(distinct user_id)
from core_action
where referring_mailing_id=m.id and subscribed_user=1) as referred,
(select count(distinct user_id)
from core_action a
join core_unsubscribeaction ua on a.id=action_ptr_id
where a.mailing_id=m.id) as unsubbed
FROM
# only pulling mailings here, not joining any large tables
# (otherwise the queries above would run too many times)
core_mailing m
WHERE
started_at > curdate() - interval 7 day
having sent > 10;
Note
Some notes on how the query was constructed:
started_at > curdate() - interval 7 day
gets you mailings for the past 7 days, you can change this to any interval.created_user
is 1 if a user was created, meaning they've never been in the database before, and 0 otherwise. There are two gotchas about how the data's set up:created_user
means the email has never been seen by AK before, even as a bounced/unsubscribed user. If you want that, that's fine; if not,subscribed_user=1
also includes users who had been bounced and just rejoined the list. I usedsubscribed_user
above.- If someone passes an email on to a friend, the mailing ID is stored in the
referring_mailing_id
column instead of themailing_id
column.
count(distinct created_user)
only counts distinct values ofcreated_user
, and there are at most two: 0 and 1. What you need is to count distinct values ofuser_id
but only for actions withcreated_user=1
. I usedcreated_user=1
in theWHERE
clause of the "referred" query.unsubs
count as actions. In my query above I did a left join againstcore_unsubscribeaction
to avoid counting them in the 'actions' column.
Finding Inactive Users¶
Sometimes you want to identify inactive users, usually to exclude them from mailings. The best way to handle inactive users is generally to use our Re-engagement Tool, but sometimes you will want to exclude a more specific group of users on a temporary basis. Queries for a lack of activity are often slow, but the summary_user table can help with that. Here's a query that finds gmail users who joined the main list more than 60 days ago and haven't opened or clicked in 60 days:
SELECT id
FROM core_user u
JOIN summary_user su ON (su.user_id = u.id)
WHERE su.mailbox_provider = 'gmail'
AND (
su.last_open > DATE_SUB(NOW(), INTERVAL 60 DAY)
OR su.last_click > DATE_SUB(NOW(), INTERVAL 60 DAY)
OR su.last_subscribed > DATE_SUB(NOW(), INTERVAL 60 DAY)
);
Finding active users is quick with summary_user
. Here's a query that finds people who
took three or more actions in the last three months or donated in the
last three months:
SELECT user_id
FROM summary_user
WHERE actions_last_90_days > 2
OR last_donation > DATE_SUB(NOW(), INTERVAL 90 DAY);