| Modified | DDT |>| WED.MAR,990310,11:55-4 | | CoSy/Home ; CoSy/Current | © Coherent Systems Inc . |
I have uploaded the files :
art1raw.txt The raw text copied from the
CSPAN HTML source for the Article 1 vote .
art1.txt Cleaned up ASCII file , Article 1 .
art3.txt Cleaned up ASCII file , Article 3 .
So you can download them , download K and do your own analyses .
In CoSy/APL : Had to do in parts so this is the last iteration of reading CSPAN page source .
rho Q is 20000 40000 ASCIIREAD '\COSY\ToCoSy.TXT' |>| 6363 |#|
rho QW is Q RPL 'ø<tr><td><b>ø\ ' |>| 5599 |#|
rho QW is QW RPL 'ø</b></td><td></td><td>ø ; ' |>| 4497 |#|
rho QW is QW RPL 'ø</td><td> </td><td align=center>ø ; ' |>| 2699 |#|
rho QW is QW RPL 'ø</td></tr>ø ' |>| 2177 |#|
rho QW is QW RPL 'ø<tr></tr><tr></tr>ø' |>| 2087 |#|
K : Seeing that this cleaned up the text sufficiently , and to go further in CoSy would be a pain , I decided it was a good time to try to do something practical in K . These lines can be copied into a running K process to execute .
|(|
Q : 0: "C:/cosy/art1raw.txt" / read the file of HTML copied from CSPAN
# Q / Count of Q . Each line becomes an item .
486
`show $ `Q / GUI Display Q
2 # Q / First couple of items in the raw file
("<tr><td><b><A NAME=A>Abercrombie, Neil</a></b></td> <td></td><td>D-HI</td><td> </td><td align=center> NAY </td></tr>"
Q : _ssr[ ; "<tr><td><b>" ; "" ]' Q / K`s StringSearchReplace .
Q : _ssr[ ; "</b></td><td></td><td>" ; " | " ]' Q
Q : _ssr[ ; "</td><td> </td><td align=center>" ; " | " ]' Q
Q : _ssr[ ; "</td></tr>" ; " " ]' Q
/ Note Inserted '|' character to delimit Table Data items .
QW : { ( 0 , & x = "|" ) _ x }' Q
/ Apply an ad hoc function , which splits its argument where it equals
/ the '|' character , to each item of Q .
QW : QW[ & 3 = #:' QW ] / Select those rows which split in 3 . Those are
/ the data rows .
r : ( QW[;1] _sm "*D-*" ) & ( QW[;2] _sm "*AYE*" )
/ Boolean selecting items where the second sub item matches '*D-*' and ,
/ likewise , the third contains 'AYE' . Note , the 1st index is 0 .
/ The Democrat Honor Roll for Article 1 Grand Jury Perjury :
,/' QW[ & r ] / Select items of QW where r = 1 .
("Goode, Virgil H., Jr. | D-VA | AYE "
"Hall, Ralph M. | D-TX | AYE "
"John, Christopher | D-LA | AYE "
"McHale, Paul | D-PA | AYE "
"Stenholm, Charles W. | D-TX | AYE "
"Taylor, Gene | D-MS | AYE ")
/ ,/' strings the sub items of each item back together again .
/ I cleaned up the spacing in the display by hand .
/ Republican DisHonor Roll :
,/' QW[ & ( QW[;1] _sm "*R-*" ) & ( QW[;2] _sm "*NAY*" ) ]
("Houghton, Amo | R-NY | NAY "
"King, Peter T. | R-NY | NAY "
"Morella, Constance A. | R-MD | NAY "
"Shays, Christopher | R-CT | NAY "
"Souder, Mark E. | R-IN | NAY ")
/ A couple of other ways to write the selections .
&/ ( + QW[; 1 2 ] ) _sm ( "*D-*" ; "*AYE*" )
/ AND across the 2 columns flipped , each matched with its
/ corresponding phrase .
&/ ~ QW[;2] _sm/: ( "*AYE*" ; "*NAY*" )
/ AND across the NOTs of column 2 stringMatched with each item
/ in the list ( "*AYE*" ; "*NAY*" ) , ie : rows voting neither .
+/ ( QW[;1] _sm "*D-*" ) & ( QW[;2] _sm "*AYE*" )
6
+/ ( QW[;1] _sm "*R-*" ) & ( QW[;2] _sm "*NAY*" )
5
"c:/cosy/art1.txt" 0: ,/' QW / Write the cleaned up items to a text
/ file with sub items catenated back together .
QWE : { ( 0 , & x = "|" ) _ x }' 0: "C:/cosy/art1.txt"
/ Read the File and split it again .
QW ~ QWE
1 / Matches .
/ The Democrat Honor Roll for Article 3 , Conspiracy to Obstruct Justice :
("Goode, Virgil H., Jr. | D-VA | AYE "
"Hall, Ralph M. | D-TX | AYE "
"John, Christopher | D-LA | AYE "
"McHale, Paul | D-PA | AYE "
"Stenholm, Charles W. | D-TX | AYE "
"Taylor, Gene | D-MS | AYE ")
|)|
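
For readers without a K interpreter at hand, the steps of the K session above (strip the tags, split each line on '|', keep the rows that split in 3, then select with a Boolean) can be roughly sketched in Python. This is only an analogue under stated assumptions, not the K code itself: the `RAW` rows are illustrative lines shaped to fit the four replacement patterns rather than copied from the real CSPAN page, and the helper names `clean` and `split_row` are made up for this sketch.

```python
# Illustrative rows only: shaped to fit the replacement patterns used in the
# K session, not copied from the real CSPAN page source.
RAW = [
    "<tr><td><b>Goode, Virgil H., Jr.</b></td><td></td><td>D-VA</td>"
    "<td> </td><td align=center> AYE </td></tr>",
    "<tr><td><b>Houghton, Amo</b></td><td></td><td>R-NY</td>"
    "<td> </td><td align=center> NAY </td></tr>",
    "<tr><td><b>Abercrombie, Neil</b></td><td></td><td>D-HI</td>"
    "<td> </td><td align=center> NAY </td></tr>",
    "<tr></tr><tr></tr>",  # a non-data line, dropped by the 3-field test
]

# The same four search/replace pairs as the _ssr lines.
REPLACEMENTS = [
    ("<tr><td><b>", ""),
    ("</b></td><td></td><td>", " | "),
    ("</td><td> </td><td align=center>", " | "),
    ("</td></tr>", " "),
]


def clean(line):
    """Apply each replacement in order, like the chain of _ssr calls."""
    for old, new in REPLACEMENTS:
        line = line.replace(old, new)
    return line


def split_row(line):
    """Split on '|'. Unlike the K cut, the delimiter is dropped and
    surrounding blanks are stripped from each field."""
    return [field.strip() for field in line.split("|")]


rows = [split_row(clean(line)) for line in RAW]
rows = [r for r in rows if len(r) == 3]  # keep only the data rows

# Boolean list: Democrat AND voted AYE, one entry per row.
dem_aye = [("D-" in party) and ("AYE" in vote) for _, party, vote in rows]
honor_roll = [row for row, keep in zip(rows, dem_aye) if keep]

print(honor_roll)
print(sum(dem_aye))  # count of selected rows, like +/ on the Boolean
```

The substring tests `"D-" in party` and `"AYE" in vote` stand in for the `_sm` wildcard matches `"*D-*"` and `"*AYE*"`.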
Note that this code will work with world class efficiency on gigabyte sets
of data .
I`ve had a running debate with some core K programmers about the use of
Boolean vectors in K versus traditional APLs . I`d ask how else they would
handle the logic here . I think the only real difference is that K`s `where ,
`& , converts Booleans to Indexes , replacing compression : & V in K is
V / iota rho V in APL .
I continue to feel the lack of true Bools in K is a substantial
limitation .
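
The two styles of selection discussed here can be compared in a small sketch. It is written in Python rather than K or APL, purely as an illustration: `where` and `compress` are hypothetical helper names standing in for K's monadic `&` and APL's compression, and the sample lists are made up.

```python
def where(mask):
    """Analogue of K's monadic & on a Boolean list: indexes of the 1s."""
    return [i for i, bit in enumerate(mask) if bit]


def compress(mask, items):
    """Analogue of APL's  mask / items : keep items where the mask is 1."""
    return [item for bit, item in zip(mask, items) if bit]


# Made-up sample data.
votes = ["AYE", "NAY", "AYE", "NAY"]
names = ["a", "b", "c", "d"]
mask = [v == "AYE" for v in votes]

# K style: turn the Boolean into indexes, then index.
idx = where(mask)
via_index = [names[i] for i in idx]

# APL style: compress directly with the Boolean.
via_compress = compress(mask, names)

# Both routes select the same items, and compressing the index list
# (iota rho V) with V gives the same indexes as where.
assert via_index == via_compress
assert compress(mask, range(len(mask))) == idx
print(idx, via_index)
```

Either way the Boolean list carries the logic; the difference is only whether it is applied directly or first converted to indexes.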
Feedback : bob@cosy.com
NB : I reserve the right to post all communications I
receive or generate to CoSy website for further reflection .