Context Filters with Tableau, Important for Top N Filters

This took me FOREVER to finally figure out, so I wanted to share a method to avoid a common mistake when using Tableau’s Top N or Bottom N filter.  The issue is that, by default, the Top filter is applied against the entire, unfiltered source data, while the user is likely expecting the Top N (or Bottom N) to be selected only after all the other filters have already been applied.  Here are the steps I’ve taken with some sample data, with the ultimate goal of selecting the “Top 3 Markets in Texas.”

 

Step 1: our original data.

Here, I’ve taken a list of customers by state.

01 customers in all markets

Step 2: Filter the top 3 markets.

Right-click on LocationMetro > Filter > Top tab. Then select “By Field” and enter 3.

02 top 3 metro areas

 

Step 3: Results – top 3 markets overall (still need to filter on Texas).

03 result top 3 metro areas

 

Step 4: Filter on Texas.

Wait! Our results have only 1 Market? I wanted 3 markets!

04 select TX

Step 5: Apply Context Filter on State

In order to preserve our “Top 3” filter, we must add a Context Filter. A Context Filter is applied FIRST, prior to any other filters on the sheet.

What was happening in Step 4 was that the worksheet chose the “Top 3” markets across all of the states first, and only then applied the Texas filter.
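The difference in filter order can be sketched in plain Python (the market names and customer counts below are hypothetical, just to illustrate the ordering):

```python
# Hypothetical customer counts: (state, market, customers)
markets = [
    ("TX", "Dallas", 850), ("TX", "Houston", 450), ("TX", "Austin", 300),
    ("TX", "El Paso", 100), ("CA", "Los Angeles", 900),
    ("CA", "San Francisco", 800), ("NY", "New York", 950),
]

def top_n(rows, n):
    """Top N markets by customer count."""
    return sorted(rows, key=lambda r: r[2], reverse=True)[:n]

# Without a Context Filter: Top 3 overall FIRST, then filter to Texas
wrong = [r for r in top_n(markets, 3) if r[0] == "TX"]
# -> only Dallas survives (1 market)

# With a Context Filter: Texas FIRST, then Top 3 within Texas
right = top_n([r for r in markets if r[0] == "TX"], 3)
# -> Dallas, Houston, Austin (3 markets)
```

The Context Filter plays the role of the inner list comprehension: it narrows the data before the Top N ranking ever runs.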

 

05 click add to context

 

Step 6: Make sure your Context Filter didn’t reset.  In this example, make sure Texas is the only state selected.

In my experience, Tableau often resets all of the selections when a filter is added to context, which requires the user to go back and re-select them. In this case, all the states were selected again, so I had to go back, unselect them all, and choose Texas.

 

06-ensure-proper-filter-is-applied

 

We’re done! Our chart now shows the Top 3 Markets in Texas!

Happy filtering!

Python and Web Scraping (using Scrapy)

Certainly the most extensible scripting language I have ever used, Python allows the user to build powerful programs ranging from web crawling to text mining to machine learning. With invaluable packages like NumPy and SciPy, Python can tackle complex modeling tasks, while other packages such as BeautifulSoup and Scrapy allow for thorough data collection through web crawling and scraping.

In the Tableau Project below, I have provided an example (with code included on the second tab) of how web crawling and data collection work, by taking a snapshot of my old motorcycle model and comparing prices from two different markets. The data was scraped using Scrapy and exported into a CSV file which I imported into Tableau.
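One small step between the scrape and Tableau is cleaning the `price` field, since Craigslist prices come through as strings like “$3,200”. A minimal cleaning sketch (the helper name and sample values are mine, not output from the scraper):

```python
import re

def clean_price(raw):
    """Strip '$' and commas from a scraped price string, returning an int (or None if empty)."""
    digits = re.sub(r"[^\d]", "", raw or "")
    return int(digits) if digits else None

# Hypothetical scraped values
samples = ["$4500", "$3,200", ""]
print([clean_price(s) for s in samples])  # [4500, 3200, None]
```

Running something like this over the CSV before importing means Tableau sees the price as a numeric measure rather than a string dimension.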

Here is the Spider code:

from scrapy.spider import BaseSpider
from scrapy.selector import HtmlXPathSelector
from craigslist_mcy.items import CraigslistMcyItem
import re


class MySpider2(BaseSpider):
    name = "craigmcy2"
    allowed_domains = ["craigslist.org"]
    start_urls = ["http://minneapolis.craigslist.org/search/mca?query=vulcan 900",
                  "http://phoenix.craigslist.org/search/mca?query=vulcan 900",
                  "http://phoenix.craigslist.org/search/mca?query=vulcan 900&s=100"]

    def parse(self, response):
        hxs = HtmlXPathSelector(response)

        # Each listing on the search-results page is a <p class="row">
        rows = hxs.select("//p[@class='row']")
        items = []
        for row in rows:
            item = CraigslistMcyItem()
            item["title"] = row.select("span[@class='txt']/span[@class='pl']/a/text()").extract()
            item["link"] = row.select("span[@class='txt']/span[@class='pl']/a/@href").extract()
            item["postedDt"] = row.select("span[@class='txt']/span[@class='pl']/time/@datetime").extract()
            item["price"] = row.select("a[@class='i']/span[@class='price']/text()").extract()
            item["debug"] = ""  # blank for now; before, it was: row.select("a[@class='i']").extract()
            # Derive the market from the page <title> (e.g. "minneapolis ...")
            item["location"] = re.split('[s"] ', str(hxs.select("//title/text()").extract()).strip())
            items.append(item)
        return items

Items code:

from scrapy.item import Item, Field

class CraigslistMcyItem(Item):
  title = Field()
  link = Field()
  postedDt = Field()
  price = Field()
  debug = Field()
  location = Field()

Run code (aka “Main”):


import os

# Run from the Scrapy project root: list the available spiders,
# then crawl and export the results to CSV via the Scrapy CLI.
os.system('scrapy list & pause')  # "& pause" keeps the Windows console open
os.system('scrapy crawl craigmcy2 -o craigslist_peter.csv')


Grad School Progress

The field of analytics is constantly evolving. I have enrolled in Northwestern University’s Master of Science in Predictive Analytics program (in Evanston, IL) to help provide me with a fresh perspective on today’s top methodologies, tools, and business case studies.  You can track my grad school progress with a Gantt chart that I created using Tableau. I will keep this up-to-date until I’ve earned my degree (expected 2016).