Showing posts with label statistics. Show all posts

Tuesday, September 30, 2014

5 Years of Continuous Blogging, Part 2


In part 1 of this series on five years of blogging, I emphasized the quantities of articles over the last five years. This time, I emphasize the numbers of words in the articles. The images show the distribution in scatter plots, a line chart, and a box-and-whiskers chart.

The scatter plot for the third year indicates a narrower stream of word counts than the other years. However, it's also the year with the fewest articles. The spread between high and low for word count is as follows:

Period                 Spread (words)
Sep 2009 to Aug 2010   1323
Sep 2010 to Aug 2011   895
Sep 2011 to Aug 2012   613
Sep 2012 to Aug 2013   1277
Sep 2013 to Aug 2014   852
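
The spread is simply the highest word count for a period minus the lowest. A minimal Python sketch, assuming the word counts for one period are already in a list. The function name and the numbers are made-up placeholders for illustration, not the blog's actual data:

```python
def word_count_spread(word_counts):
    """Return the spread (highest minus lowest) of a list of word counts."""
    if not word_counts:
        raise ValueError("need at least one word count")
    return max(word_counts) - min(word_counts)


# Hypothetical word counts for one year of articles (placeholder values).
sample_year = [410, 650, 380, 900, 570]

print(word_count_spread(sample_year))  # 900 - 380 = 520
```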

The line chart shows data points for high, average (mean), median, and low word counts. The box-and-whiskers chart displays the bunching of data. The "whiskers" extend from each end of the "box" out to the highest and lowest word counts. For each unit:
  • The box represents the middle 50% of the word counts.
  • The two whiskers depict the other 50%, with 25% at each end.
  • The line inside the box depicts the median word count (found by sorting the word counts by article in a spreadsheet and taking the midpoint).
For each yearly period, the line chart's numbers for most words, fewest words, and median coincide with the end points and median in the box-and-whiskers chart. Note that the fifth-year "box" (the smallest of the five yearly periods) shows that half the articles fall between approximately 400 and 600 words, with the median around 450.
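
The five numbers behind one box-and-whiskers unit (low, lower box edge, median, upper box edge, high) can also be computed in a few lines. This is only a sketch using Python's standard `statistics` module. The function name and sample word counts are placeholders, not the blog's data, and different tools compute quartiles slightly differently, so a spreadsheet chart may show somewhat different box edges:

```python
import statistics


def five_number_summary(word_counts):
    """Return (low, q1, median, q3, high) for a list of word counts."""
    data = sorted(word_counts)
    # n=4 splits the data into quarters and returns the three cut points.
    # "inclusive" treats the list as the complete set of articles.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical word counts (placeholder values, not the blog's data).
sample = [300, 400, 450, 500, 600, 700, 900]
low, q1, median, q3, high = five_number_summary(sample)

print("whisker ends:", low, high)
print("box edges:", q1, q3)
print("median line:", median)
```

The box spans `q1` to `q3`, so half of the articles fall inside it, and each whisker covers the quarter of the articles beyond one edge of the box.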

Some handy resources on graphing, with the first two being the most helpful for me:
Excerpt from Box plot that summarizes "whiskers":
lines extending vertically from the boxes (whiskers) indicating variability outside the upper and lower quartiles
In "5 Years of Continuous Blogging, Part 3", I will go into more detail about the road I traveled in obtaining and processing my data.

October 31, 2014: Links to the series
  1. 5 Years of Continuous Blogging, Part 1
    Focus on single- and five-year views for total articles, articles with images, and recipe articles.
  2. 5 Years of Continuous Blogging, Part 2
    Focus on numbers of words in articles and graphical representations.
  3. 5 Years of Continuous Blogging, Part 3
    More details on collecting the data.
  4. 5 Years of Continuous Blogging, Part 4
    Emphasis on data sorting and distribution of word count groupings.

Saturday, August 28, 2010

Rear-viewing Year of Blogging

I started blogging just about a year ago, having decided my output would be three times a month, which breaks down to one about every ten-day period. For some people, that's way too seldom. Well, that's the pace I can live with. I want to put out quality, well-thought-out writing that frequently includes links, which are often time-consuming to vet.

Journey start
I joined my writing clublet, TheWriteJob, a little over a year ago to meet other writers and would-be writers. The community blog sparked my interest in contributing to it and, eventually, in launching my own blog. After having published six articles there, I registered for my own blog. I ported my previous articles over (truncated and linked the earlier articles) and have been publishing here since.

Theme
In setting up my blogspot, I thought about my theme. I came up with "writing mostly for language enlightenment, entertainment, and a-muse-meant". It became more of a guide for me to determine my article topics. Low standards—if I fulfill any of those broad categories for an article, I succeed in achieving my topic goal.

Theme expansion into categories
A few months ago, I added a line to the theme, as I felt categories were starting to pop up. My category labels (language, tech communications, EZ recipes, food, wordplay, humor, music, tech topics, and how-to's) also form the basis of my article today, compiling and analyzing stats of my year in blogging. I'm omitting discussion of Google Analytics. I use it, but don't have enough of a fan base or readership to report anything impressive. :-)

First compilation file
*LinkedIn membership required to view this file*
A while back, I created a compilation file that included the article title, linked URL, publish timeframe (early, mid, or late part of the specified month), and summary. The format was 2-column landscape. Recently, I decided to redo the compilation file. Repeatedly adding and removing column breaks with every update to make the file look nice started to irk me.

Second compilation file
(Newer! Improved! Now with category descriptors!)

*LinkedIn membership required to view this file*
The impetus to change the formatting was wanting to categorize the articles, logically using the descriptors I thought of. Also, I knew I'd want to write and time an article for the 1-year milestone. I removed the column formatting and breaks, then converted the file into an 11-column table. The first column has the title, URL, and summary; the second column has the date I published; and the rest of the columns have the category descriptors and check marks. Because food is near and dear to my heart, I highlighted food rows in yellow to make them stand out.

For each article, my new compilation file has check marks in the categories I consider appropriate. I did pause over designating some category names for a few articles. For instance, can a food article be a tech article? Yes, I decided "Wanted Unholed Lotta Bagel" fit the descriptor of tech topics because of its history, techniques, and related background.

I waffled (food!) over articles about language and technical communications. Most that fit in one category also fit the other category. Looking at my table (place for food!), I found that language was more prominent than my profession of tech comm (writing, editing).

Stats (drum roll! yum!)
Since September 6, 2009, I have published 36 articles. I don't include the current article in my stats, although I will have updated my table to include it (code green). Deciding categories was the longest part of the process. The fun part was tallying everything: the number of check marks for each descriptor and for each article, first for each of the five pages of my printout (yes, hardcopy!), then adding them up. Natch, if I had a LOT to tally, I would have put everything into Excel. I used Word. (Gasp!)

Category              Qty check marks
Language              20
Tech communications   14
EZ recipes            8
Food                  11
Wordplay              15
Humor                 21
Music                 10
Tech topics           16
How-to's              18
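
The per-category tally above was done by hand from a printout. For a bigger grid, the same count could be scripted. Here is a minimal Python sketch, assuming each article is recorded with the list of categories it was checked for. The article names, category assignments, and function name are placeholders, not the real compilation file:

```python
from collections import Counter

# Hypothetical grid: each article maps to the categories it was checked for.
# Titles and assignments are placeholders, not the real compilation file.
articles = {
    "Article A": ["language", "humor", "wordplay"],
    "Article B": ["food", "EZ recipes", "how-to's"],
    "Article C": ["language", "tech communications", "humor"],
}


def tally_check_marks(grid):
    """Count how many articles carry a check mark for each category."""
    tally = Counter()
    for categories in grid.values():
        # set() guards against a category being listed twice for an article.
        tally.update(set(categories))
    return tally


for category, qty in tally_check_marks(articles).most_common():
    print(f"{category}: {qty}")
```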

Articles with the most descriptors—a 3-way tie with 6 descriptors each
Fish Fries Telephone
Wanted Unholed Lotta Bagel
Technical Communications Means

Articles with the 2nd most descriptors—a 5-way tie with 5 descriptors each
Vocabs of Steel
Greater Less Fewer More Thans--More or Less
Bad-Prose Rants from Lady Wawa
Pronunciations Heck with Hermione and Homage
Color N R Lives

Rest of article quantities (titles omitted)
Note to novice statisticians: I tic-marked the article quantities and added them up to confirm they total 36—no duplicated counts and no undercounts.

Qty articles   Qty descriptors
2              4
10             3
4              2
2              1
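
The cross-check described in the note (group the articles by how many descriptors each has, then confirm that the groups add up to the article total) can be sketched in Python as well. The per-article numbers and names below are placeholders, not the real tallies:

```python
from collections import Counter

# Hypothetical number of descriptors checked for each article
# (placeholder values, not the real tallies).
descriptors_per_article = {
    "Article A": 3,
    "Article B": 3,
    "Article C": 5,
    "Article D": 1,
}


def group_by_descriptor_count(per_article):
    """Map each descriptor count to the number of articles that have it."""
    return Counter(per_article.values())


groups = group_by_descriptor_count(descriptors_per_article)
for qty_descriptors, qty_articles in sorted(groups.items(), reverse=True):
    print(f"{qty_articles} article(s) with {qty_descriptors} descriptor(s)")

# Cross-check: every article is counted exactly once.
assert sum(groups.values()) == len(descriptors_per_article)
```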

Categories for this article
I would categorize this article under tech communications, food (coupla nibbles!), humor (minor rib ticklers!), tech topics, and how-to's. I don't consider the light mentions of the other categories enough to warrant checking off all the descriptors. :-) Although I did not include numbered steps that indicate a process, I think there's enough of a road map feel here for people who want to put information on a grid.