08/30/11(Tue)18:24 No.29215876 >>29215734 Man, I know that feel, programming is pretty much all I can do. Don't feel bad, though, it's most likely just not having any experience with the standard library or syntax or anything. The gist of it is: load the archive main page (urllib2.urlopen(blah).read()), find all of the tables (re is the regex module), pick out the last one (negative list indexes count from the end), then pick out every row from that (<tr>s are Table Rows). In each row, split on the columns (<td>), and then pick out the important columns. For the date we don't want the last five characters, because that's the closing </td>, and for the number of posts we can just split on the space between the number and the "post(s)". Once that data is extracted, it checks to see if that date is in a hashmap (a dict in Python; HashMap in Java, Dictionary in C#), and if so, adds this thread's post count to the dateObject instance tied to that date. If it isn't, it initialises a new dateObject and sticks it in the hashmap, and fuck, that's not actually adding the first thread of the day to the total post numbers, damnit. Still, pretend it is.
Everything after that is just boilerplate to make the graph.
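If it helps to see the parsing steps in one place, here's a rough sketch of them, not the original script. It's written for Python 3, so urllib2.urlopen becomes urllib.request.urlopen. The column positions (DATE_COL, POSTS_COL), the function names and the SAMPLE_HTML table are all made up for illustration, since the real archive layout isn't shown here. It also uses a plain dict of totals instead of a dateObject, which means the first thread of each day does get counted.

```python
import re
import urllib.request

# Hypothetical column positions; the real archive layout is an assumption.
DATE_COL = 1
POSTS_COL = 2

# Made-up stand-in for the archive page, so the sketch runs offline.
SAMPLE_HTML = """
<table><tr><td>not the archive</td></tr></table>
<table>
<tr><th>Thread</th><th>Date</th><th>Posts</th></tr>
<tr><td>Thread A</td><td>08/29/11</td><td>7 post(s)</td></tr>
<tr><td>Thread B</td><td>08/30/11</td><td>12 post(s)</td></tr>
<tr><td>Thread C</td><td>08/30/11</td><td>3 post(s)</td></tr>
</table>
"""


def fetch_page(url):
    """Download a page and return it as text (Python 3 version of urllib2.urlopen(url).read())."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def count_posts_by_date(html):
    """Return a dict mapping each date string to the total posts made that day."""
    # Find every table, then keep the last one (negative index counts from the end).
    tables = re.findall(r"<table.*?</table>", html, re.S | re.I)
    if not tables:
        return {}
    rows = re.findall(r"<tr.*?</tr>", tables[-1], re.S | re.I)

    totals = {}
    for row in rows:
        # Split on the column tag; the chunk before the first <td> is just "<tr>".
        cells = [cell.strip() for cell in row.split("<td>")[1:]]
        if len(cells) <= max(DATE_COL, POSTS_COL):
            continue  # header row or anything without enough columns
        # Drop the last five characters, which are the closing </td>.
        date = cells[DATE_COL][:-5]
        # "12 post(s)</td>..." -> "12"
        posts = int(cells[POSTS_COL].split(" ")[0])
        # Add to the running total, starting from 0 the first time a date shows up.
        totals[date] = totals.get(date, 0) + posts
    return totals


if __name__ == "__main__":
    print(count_posts_by_date(SAMPLE_HTML))
```

To run it against a real page you'd pass fetch_page(your_url) in place of SAMPLE_HTML and adjust the two column constants to match. Regex on HTML is fragile, so this only holds up while the page keeps the same plain <td> layout.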