Modified | DDT |>| WED.MAR,990310,11:55-4 | CoSy/Home ; CoSy/Current © Coherent Systems Inc .

ImpeachVoteAnalysis.K

Extraction of Honorable Democrats in House of Representatives from CSPAN.org
vote records using Arthur Whitney`s K language

I started doing the analysis in CoSy , but CoSy suffers from the 64k limits of its antique DOS origin .

I have uploaded the files :
  art1raw.txt   The raw text copied from the CSPAN HTML source for the Article 1 vote .
  art1.txt      Cleaned up ASCII file , Article 1 .
  art3.txt      Cleaned up ASCII file , Article 3 .
So you can download them , download K , and do your own analyses .

 
  In CoSy/APL :
 I had to do this in parts , so this is the last iteration of reading the CSPAN page source .
  
  rho Q is 20000 40000 ASCIIREAD '\COSY\ToCoSy.TXT' |>| 6363  |#|
  rho QW is Q  RPL 'ø<tr><td><b>ø\ '                |>| 5599  |#|
  rho QW is QW RPL 'ø</b></td><td></td><td>ø ; '    |>| 4497  |#|
  rho QW is QW RPL 'ø</td><td>   </td><td align=center>ø  ; ' |>| 2699  |#|
  rho QW is QW RPL 'ø</td></tr>ø '                  |>| 2177  |#|
  rho QW is QW RPL 'ø<tr></tr><tr></tr>ø'           |>| 2087  |#|
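
 For readers without CoSy , the sequence of replacements above can be sketched in Python . This is only an illustration of the idea , not the CoSy code : the sample row is hypothetical ( modelled on the CSPAN rows quoted in this note ) , the names RAW_ROW , REPLACEMENTS and clean_row are made up for the example , and the empty first replacement is an assumption .

```python
# Hypothetical sample row, modelled on the raw CSPAN rows quoted in this note.
RAW_ROW = ("<tr><td><b>Ackerman, Gary L.</b></td><td></td><td> D-NY"
           "</td><td>   </td><td align=center> NAY</td></tr>")

# (tag run, replacement) pairs, applied in order. The " ; " delimiter follows
# the RPL lines above; the empty first replacement is an assumption.
REPLACEMENTS = [
    ("<tr><td><b>", ""),
    ("</b></td><td></td><td>", " ; "),
    ("</td><td>   </td><td align=center>", " ; "),
    ("</td></tr>", " "),
]


def clean_row(row: str) -> str:
    """Apply each replacement in turn to one line of page source."""
    for old, new in REPLACEMENTS:
        row = row.replace(old, new)
    return row


if __name__ == "__main__":
    print(repr(clean_row(RAW_ROW)))
```

 Each pass swaps a run of table tags for a short delimiter , so the text shrinks while the name , party and vote stay in order .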

 K :
 Seeing that this cleaned up the text sufficiently , and that going further in
  CoSy would be a pain , I decided it was a good time to try to do something
  practical in  K .
 These lines can be copied into a running  K  process to execute .
       |(|
  Q : 0: "C:/cosy/art1raw.txt"    / read the file of HTML copied from CSPAN
  # Q                             / Count of Q . Each line becomes an item .
486
  `show $ `Q                      / GUI Display Q
  2 # Q        / First couple of items in the raw file   
("<tr><td><b><A NAME=A>Abercrombie, Neil</a></b></td> <td></td><td>D-HI</td><td> </td><td align=center> NAY </td></tr>"
"<tr><td><b>Ackerman, Gary L.</b></td><td></td><td> D-NY</td><td> </td><td align=center> NAY</td></tr>")
 
Q : _ssr[ ; "<tr><td><b>" ; "" ]' Q     / K`s StringSearchReplace .
Q : _ssr[ ; "</b></td><td></td><td>" ; " | " ]' Q
Q : _ssr[ ; "</td><td>   </td><td align=center>" ; " | " ]' Q
Q : _ssr[ ; "</td></tr>" ; " " ]' Q
       / Note Inserted  '|'  character to delimit Table Data items  .
 QW : { ( 0 , & x = "|" ) _ x }' Q
       / Apply an ad hoc function , which splits its argument where it equals
       / the '|' character , to each item of  Q .

 QW : QW[ & 3 = #:' QW ]   / Select those rows which split in 3 . Those are
                           / the data rows .

 r : ( QW[;1] _sm "*D-*" ) & ( QW[;2] _sm "*AYE*" )
       / Boolean selecting items where the second sub item matches '*D-*' and ,
       / likewise , the third matches '*AYE*' . Note , the 1st index is 0 .

    / The Democrat  Honor Roll for Article 1 Grand Jury Perjury :
 ,/' QW[ & r ]     / Select items of QW where r = 1 .  
("Goode, Virgil H., Jr. | D-VA | AYE "
 "Hall, Ralph M.        | D-TX |  AYE "
 "John, Christopher     | D-LA | AYE "
 "McHale, Paul          | D-PA | AYE "
 "Stenholm, Charles W.  | D-TX | AYE "
 "Taylor, Gene          | D-MS | AYE ")               
       /    ,/' strings the sub items of each item back together again .
       / I cleaned up the spacing in the display by hand .

       / Republican DisHonor Roll :
  ,/' QW[ & ( QW[;1] _sm "*R-*" ) & ( QW[;2] _sm "*NAY*" ) ]  
("Houghton, Amo          | R-NY | NAY "
 "King, Peter T.         | R-NY | NAY "
 "Morella, Constance A.  | R-MD | NAY "
 "Shays, Christopher     | R-CT | NAY "
 "Souder, Mark E.        | R-IN | NAY ")
                                                              
       / A couple of other ways to write the selections .
  &/ ( + QW[; 1 2 ] ) _sm ( "*D-*" ; "*AYE*" )
       / AND across the 2 columns flipped , each matched with its
       /  corresponding phrase .

  &/ ~ QW[;2] _sm/: ( "*AYE*" ; "*NAY*" )
       / AND across the negations of column 2 stringMatched with each item
       / in the list  ( "*AYE*" ; "*NAY*" ) , i.e. rows that are neither .

  +/ ( QW[;1] _sm "*D-*" ) & ( QW[;2] _sm "*AYE*" )    
6                                                      
  +/ ( QW[;1] _sm "*R-*" ) & ( QW[;2] _sm "*NAY*" )    
5                                                      

 "c:/cosy/art1.txt" 0: ,/' QW  / Write the cleaned up items to a text
                               / file with sub items catenated back together .

 QWE :  { ( 0 , & x = "|" ) _ x }' 0: "C:/cosy/art1.txt"
                               / Read the File and split it again .
 QW ~ QWE              
1                           / Matches .


 / The Democrat  Honor Roll for Article 3 , Conspiracy to Obstruct Justice :
("Goode, Virgil H., Jr. | D-VA |  AYE "
 "Hall, Ralph M. | D-TX |  AYE "
 "John, Christopher | D-LA |  AYE "
 "McHale, Paul | D-PA |  AYE "
 "Stenholm, Charles W. | D-TX |  AYE "
 "Taylor, Gene | D-MS |  AYE ")
       |)|   
Note that this code will work with world class efficiency on gigabyte sets of data .
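
For readers without a K interpreter , the split , filter and select steps of the session above can be sketched in Python . This is a rough equivalent under stated assumptions , not a translation of the K : the function names split_row and select are made up for the example , fnmatchcase stands in for K`s _sm wildcard match , and unlike the K split it drops the '|' character and trims the blanks . The three data rows are taken from this note ; the tag line is a hypothetical non-data line .

```python
from fnmatch import fnmatchcase

# Cleaned rows in the "name | party-state | vote" layout used in this note,
# plus one hypothetical non-data line that does not split into three parts.
ROWS = [
    "Goode, Virgil H., Jr. | D-VA | AYE ",
    "Houghton, Amo | R-NY | NAY ",
    "Ackerman, Gary L. | D-NY | NAY ",
    "<table border=0>",
]


def split_row(row):
    """Split one line at each '|' and trim the blanks around each part."""
    return [part.strip() for part in row.split("|")]


def select(rows, party_pattern, vote_pattern):
    """Return the data rows whose party and vote match the wildcard patterns."""
    table = [split_row(row) for row in rows]
    # Keep only rows that split into exactly three parts: the data rows.
    table = [parts for parts in table if len(parts) == 3]
    # Boolean mask, one flag per data row.
    mask = [fnmatchcase(parts[1], party_pattern) and
            fnmatchcase(parts[2], vote_pattern)
            for parts in table]
    return [" | ".join(parts) for parts, keep in zip(table, mask) if keep]


if __name__ == "__main__":
    print(select(ROWS, "*D-*", "*AYE*"))  # Democrats voting AYE
    print(select(ROWS, "*R-*", "*NAY*"))  # Republicans voting NAY
```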

I`ve had a running debate with some core K programmers about the use of Boolean vectors in K versus traditional APLs . I`d ask how else they would handle the logic here . I think the only real difference is that K`s where , & , which converts Booleans to indexes , replaces compression : & V in K does what V / iota rho V does in APL .
I continue to feel the lack of true Bools in K is a substantial limitation .
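
The two styles of selection can be put side by side in a short Python sketch . It is only an illustration : the helper names where and compress are made up for the example , the four rows are a small sample of votes listed in this note , and Python , like K , counts indexes from 0 .

```python
def where(mask):
    """Indexes of the true items, in the spirit of K's  & mask ."""
    return [i for i, flag in enumerate(mask) if flag]


def compress(mask, items):
    """APL-style compression: keep the items whose mask entry is true."""
    return [item for item, flag in zip(items, mask) if flag]


names = ["Goode", "Houghton", "Ackerman", "Taylor"]
party = ["D", "R", "D", "D"]
vote = ["AYE", "NAY", "NAY", "AYE"]

# Boolean vector: Democrat AND voted AYE, one flag per row.
mask = [p == "D" and v == "AYE" for p, v in zip(party, vote)]

# Style 1: convert the Booleans to indexes, then index.
by_index = [names[i] for i in where(mask)]

# Style 2: compress the list directly with the Boolean vector.
by_compress = compress(mask, names)

if __name__ == "__main__":
    print(where(mask))              # [0, 3]
    print(by_index == by_compress)  # True
```

Either way the logic is carried by the same Boolean vector ; the styles differ only in whether the indexes are made explicit before selecting .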


Feedback :
bob@cosy.com
NB : I reserve the right to post all communications I receive or generate to CoSy website for further reflection .