As I try to finish up my website, I am having trouble with a
few things. Mostly time. It is amazing how much time it takes to do things on
the computer! But on the positive side, I found a cool website for trying out
Regex (short for regular expressions, a way of writing search patterns for
text). The site doesn’t actually clean anything, but it can find and highlight
anything you want to look for in your work. So if you wanted to double-check
for errors or just see what’s in your documents or combine documents, this
website would be helpful. It just won’t (as far as I know) find and replace
anything for you. But OCR has been really difficult! And I haven’t even tried
to clean PDFs! I’m just trying to get it done with a few Word documents.
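
For what it's worth, regular expressions can replace text as well as find it when they are run from a programming language rather than a highlighter website. Here is a minimal sketch using Python's built-in `re` module. The sample text, the `clean_ocr` helper, and the two cleanup rules (rejoining words hyphenated across a line break, collapsing extra spaces) are all made up for illustration; real OCR output would probably need different patterns.

```python
import re

# Made-up sample of messy OCR output (an assumption, for illustration only).
sample = "The  docu-\nment was   scanned in 1998 and 2003."


def clean_ocr(text):
    """Hypothetical helper: apply two simple find-and-replace rules."""
    # Rejoin words hyphenated across a line break: "docu-\nment" -> "document".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces or tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    return text


# Find: list every four-digit number (for example, years) in the text.
years = re.findall(r"\b\d{4}\b", sample)

# Replace: produce a cleaned copy of the text.
cleaned = clean_ocr(sample)

print(years)
print(cleaned)
```

The finding step does the same job as the highlighting on the website, and `re.sub` is the replacing step. Trying a pattern on the website first and copying it into a script only once it highlights the right things seems like a safe way to work.
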
But I think I might have finished my MALLET page. I still don’t really
understand it, but my tutor helped me go through some of the basics. I used
the command line in the Terminal to run it on three documents and see what it
spit out. I still don’t really understand what all the numbers mean or how
it’s set up, but it worked, I guess. I haven’t played with it enough to see how
to actually use this for my research. The format doesn’t make sense to me.
Maybe I still need to put it into Excel or something. But I nonetheless have a
screenshot (another new thing for me!) of the results. I also put it into Overview.
That made even less sense, but maybe that’s because I only used three documents
(and unrelated ones – just the three on top). Maybe if I play with it more and
add more to it I will understand better. But the main thing is that I did it,
and I even have pictures of my work. It’s a baby step, but a step! (Got to end
these failure blogs on a positive note!)