Chapter 3, Part 10: Printing the hierarchical clusters

I had mistaken a named argument for an increment.

printclust(clust.left, labels=labels, n=n+1)

The n=n+1 here is a Python keyword argument: it passes the value n+1 to the parameter named n. It is not incrementing anything.
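
The same spelling means something different in Ruby, which matters for the port below. As a minimal sketch (the `show` method and the variable names are made up for illustration), writing `n = n + 1` inside an argument list is an ordinary assignment in the caller's scope, so it really would change `n`:

```ruby
# Hypothetical helper: returns whatever it was given.
def show(n)
  n
end

n = 0
a = show(n + 1)      # passes 1; the caller's n is untouched
after_plain = n
b = show(n = n + 1)  # assigns 1 to the caller's n, then passes that value
after_assign = n

puts "n + 1     : passed #{a}, n is now #{after_plain}"
puts "n = n + 1 : passed #{b}, n is now #{after_assign}"
```

In a recursive printer that calls itself once for the left branch and once for the right, copying `n=n+1` verbatim into Ruby would therefore indent the right branch one level deeper than the left.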


Added a printclust method to clusters.rb:
http://www.bitbucket.org/shokai/collective-intelligence-study/src/cadaac71b6aa/03/clusters.rb

  # Print the hierarchical cluster tree
  def printclust(clust,labels=nil,n=0)
    # Indent by depth to get a hierarchical layout
    n.times do
      print ' '
    end
    if clust.id < 0
      # A negative id means this node is a branch
      puts '-'
    else
      # A non-negative id means this node is a leaf
      if labels == nil
        puts clust.id
      else
        puts labels[clust.id]
      end
    end

    # Print the left and right branches, one level deeper
    printclust(clust.left, labels, n+1) if clust.left != nil
    printclust(clust.right, labels, n+1) if clust.right != nil
  end

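It recurses down the tree, printing as it goes. The Ruby I'm using doesn't seem to have named arguments, so the call just passes n+1 positionally.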
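
For readers on Ruby 2.0 or later, which does support keyword arguments, the Python call can be mirrored more closely. This is only a sketch under assumptions: `Node` is a hypothetical stand-in for the cluster class in clusters.rb, `printclust_kw` is a made-up name, and the `out:` parameter is added here purely so the output can be captured.

```ruby
require 'stringio'

# Hypothetical stand-in for the cluster node class (id, left, right only).
Node = Struct.new(:id, :left, :right)

# Keyword-argument variant of printclust (needs Ruby 2.0+).
def printclust_kw(clust, labels: nil, n: 0, out: $stdout)
  out.print ' ' * n                 # indent by depth
  if clust.id < 0
    out.puts '-'                    # negative id: branch
  else
    out.puts(labels ? labels[clust.id] : clust.id)  # leaf
  end
  # n: n + 1 passes a value to the parameter n; nothing is incremented.
  printclust_kw(clust.left,  labels: labels, n: n + 1, out: out) if clust.left
  printclust_kw(clust.right, labels: labels, n: n + 1, out: out) if clust.right
end

# A tiny hand-built tree: one leaf, plus a branch holding two leaves.
tree = Node.new(-1, Node.new(0), Node.new(-2, Node.new(1), Node.new(2)))
buf = StringIO.new
printclust_kw(tree, labels: %w[a b c], out: buf)
print buf.string
```

Here `n: n + 1` reads much like Python's `n=n+1`: it names the parameter and supplies its value without touching the caller's `n`.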



Running it
Save the following as test.rb:

require 'clusters.rb'

cs = Clusters.new
blognames,words,data = cs.readline('myblogdata.txt')

clust = cs.hcluster(data)
cs.printclust(clust, blognames)

Run it, redirecting the output to a file:

ruby test.rb > blogname-tree


http://www.bitbucket.org/shokai/collective-intelligence-study/src/cadaac71b6aa/03/blogname-tree

-
 -
  Pharyngula
  -
   -
    MAKE Magazine
    -
     Jeremy Zawodny's blog
     -
      Engadget
      TMZ.com
   -
    -
     Slashdot
     -
      The Blotter
      -
       Hot Air » Top Picks
       -
        Little Green Footballs
        -
         NewsBusters.org - Exposing Liberal Media Bias
         -
          Power Line
          -
           Wonkette: The D.C. Gossip
           -
            The Daily Dish | By Andrew Sullivan
            Think Progress
    -
     Bloglines | News
     -
      -
       SpikedHumor - Today's Videos and Pictures
       -
        -
         Creating Passionate Users
         Schneier on Security
        -
         -
          Crooks and Liars
          -
           Daily Kos
           -
            -
             Stepcase Lifehack
             43 Folders
            -
             Talking Points Memo
             -
              ReadWriteWeb
              -
               A Consuming Experience (full feed)
               -
                Google Operating System
                -
                 GoFugYourself
                 Bloggers Blog: Blogging the Blogsphere
         -
          we make money not art
          -
           ScienceBlogs : Combined Feed
           -
            How to Change the World
            -
             -
              -
               Download Squad
               The Unofficial Apple Weblog (TUAW)
              -
               Joystiq
               Autoblog
             -
              -
               -
                Deadspin
                -
                 Lifehacker
                 -
                  Kotaku
                  -
                   Gawker
                   -
                    Valleywag
                    Gizmodo
               -
                -
                 456 Berea Street
                 -
                  Micro Persuasion
                  -
                   Signal vs. Noise
                   Seth's Blog
                -
                 Search Engine Roundtable
                 TreeHugger
              -
               Techdirt
               -
                -
                 The Full Feed from HuffingtonPost.com
                 -
                  Wired Top Stories
                  Gothamist
                -
                 Boing Boing
                 John Battelle's Searchblog
      -
       CoolerHeads Prevail
       -
        MetaFilter
        -
         Online Marketing Report
         -
          Cool Hunting
          -
           Joi Ito's Web
           -
            -
             Search Engine Watch Blog
             Neil Gaiman's Journal
            -
             The Official Google Blog
             -
              Eschaton
              The Viral Garden
 -
  Giga Omni Media, Inc.
  -
   Dave Shea's mezzoblue
   -
    flagrantdisregard
    -
     Joel on Software
     -
      Captain's Quarters
      -
       -
        blog maverick
        BuzzMachine
       -
        Copyblogger
        -
         Topix.net Weblog
         -
          -
           Steve Pavlina's Personal Development Blog
           -
            Oilman
            -
             Quick Online Tips
             -
              ProBlogger Blog Tips
              -
               Sifry's Alerts
               -
                PaulStamatiou.com
                -
                 -
                  TechCrunch
                  -
                   -
                    Google Blogoscoped
                    Matt Cutts: Gadgets, Google, and SEO
                   -
                    Publishing 2.0
                    Mashable!
                 -
                  ongoing
                  Scobleizer -- Tech geek blogger
          -
           -
            SimpleBits
            plasticbag.org
           -
            WWdN: In Exile
            -
             Shoemoney - Skills To Pay The Bills
             -
              -
               Instapundit.com (v.2)
               gapingvoid: "cartoons drawn on the back of business cards"
              -
               Joho the Blog
               -
                Michelle Malkin
                -
                 kottke.org
                 -
                  Derek Powazek
                  -
                   The Superficial - Because You're Ugly
                   Celebrity gossip juicy celebrity rumors Hollywood gossip blog from Perez Hilton

Incidentally, running this on blogdata.txt instead reproduces the output shown on p.41 of the book:

                          John Battelle's Searchblog
                          -
                           Search Engine Watch Blog
                           -
                            Read/WriteWeb
                            -
                             Official Google Blog
                             -
                              Search Engine Roundtable
                              -
                               Google Operating System
                               Google Blogoscoped

(This excerpt is from around the middle of the output.)