Cleaning of social media data in NodeXL

Nov 10, 2016 at 11:32 AM
Hi
I wonder if anyone can help me. As a non-programmer, I need to filter out all non-English content from sets of Twitter and Instagram hashtag data before undertaking the semantic visualisations. Does anyone have any advice on this?

Ideally also I would like to remove all commercial content from both data sets, though I realise that this might be rather too difficult...?

All ideas are very welcome, I am a qualitative researcher trying to learn new skills and approaches to data collection and analysis.
Many thanks..... Sarah
Coordinator
Nov 10, 2016 at 7:28 PM
Edited Nov 10, 2016 at 7:28 PM
Please have a look at the NodeXL Edges worksheet.

Column AE contains the Language value that Twitter assigns to each Tweet.

You may use the Excel filtering feature to select just the languages you want to include.
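
For anyone who prefers to do the same filtering outside Excel, here is a minimal sketch in Python. It assumes the Edges worksheet has been exported to CSV and that the language column header is "Language"; the other column names, the sample rows and the helper name filter_rows_by_language are made up for illustration, so adjust them to match your own export.

```python
import csv
import io


def filter_rows_by_language(rows, keep=("en",), column="Language"):
    """Return only the rows whose language code is in `keep` (case-insensitive)."""
    wanted = {code.lower() for code in keep}
    return [
        row for row in rows
        if (row.get(column) or "").strip().lower() in wanted
    ]


# Tiny stand-in for a CSV export of the Edges worksheet (made-up rows;
# the column headers are assumptions, not a guaranteed NodeXL layout).
sample_csv = io.StringIO(
    "Vertex 1,Vertex 2,Tweet,Language\n"
    "alice,bob,Good morning everyone,en\n"
    "carol,dave,Bonjour tout le monde,fr\n"
    "erin,frank,See you at the conference,EN\n"
)

rows = list(csv.DictReader(sample_csv))
english_rows = filter_rows_by_language(rows)
print(len(english_rows), "of", len(rows), "rows kept")
```

With a real export, you would replace sample_csv with open("your_file.csv", newline="", encoding="utf-8") and write english_rows back out with csv.DictWriter.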

Regards,

Marc
Marked as answer by MarcSmith on 11/11/2016 at 9:15 AM
Nov 10, 2016 at 8:09 PM
Thanks Marc, but does this exist for Instagram too? I need to clean both data sets, otherwise the data will be without value.

Sarah
Coordinator
Nov 10, 2016 at 10:20 PM
I am not sure about the Instagram importer, which comes from a third-party group.

Please see: http://snatools.com for details on their importer.

Regards,

Marc
Nov 11, 2016 at 8:05 AM
Yes, but I purchased NodeXL and the Instagram plugin specifically to conduct cross-platform research. If I can't clean the Instagram data efficiently, then the Instagram plugin is worthless. I suggest that an English language content filter would be a common request for many researchers, and I am hoping this can be addressed. Manual cleaning is not an option.
Coordinator
Nov 11, 2016 at 4:13 PM
Understood: this is a great suggestion to direct to the SNATools team!

Regards,

Marc
Nov 12, 2016 at 3:26 PM
Edited Nov 12, 2016 at 3:30 PM
Hello
Unfortunately, the Instagram API does not provide language detection itself, so separate code is needed to detect the language of each post. If you are a C# programmer, you can use this DLL:
https://www.microsoft.com/en-us/download/details.aspx?id=52575
We can add this to our update list too, but it might take some time.
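
To illustrate what "separate code" for language detection might look like, here is a rough sketch in Python. It is not the DLL linked above and not part of the Instagram importer. It is a crude stand-in heuristic that counts how many words in a caption are very common English words. The word list, the threshold and the function name looks_english are all assumptions chosen for illustration, and captions that consist only of hashtags or a few words will often be misjudged.

```python
import re

# Small, hand-picked set of very common English words.
# This list is an assumption of the sketch, not a standard resource.
ENGLISH_STOPWORDS = frozenset({
    "the", "and", "is", "in", "to", "of", "a", "for", "on", "with",
    "this", "that", "my", "it", "at", "you", "are", "was",
})


def looks_english(text, threshold=0.2):
    """Guess whether `text` is English from its share of common English words."""
    # Split into runs of letters (Unicode-aware), ignoring digits and punctuation.
    words = re.findall(r"[^\W\d_]+", text.lower())
    if not words:
        return False
    hits = sum(1 for word in words if word in ENGLISH_STOPWORDS)
    return hits / len(words) >= threshold


captions = [
    "Sunset at the beach with my friends",
    "Puesta de sol en la playa con mis amigos",
]
english_captions = [c for c in captions if looks_english(c)]
print(english_captions)
```

A dedicated language detection library would be more reliable than this heuristic; the sketch only shows the general shape of a filter applied to each caption.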

Regards
InstaSearcher Technical Manager (SNA Tools)
Nov 14, 2016 at 10:13 AM
Thanks. I hope other researchers will find it useful to have an English language filter.
Nov 18, 2016 at 8:37 AM
Thanks Marc - I have suggested this to them.

I am now trying to visualise a word association network, as opposed to a person network, for both the Twitter and Instagram data. I know you did one as an exemplar back in September, but what process did you use? Did you swap things over in the vertices/edges?
I have searched for a guide on this but can't find one.

Sarah



Coordinator
Nov 22, 2016 at 8:33 PM
For a guide to semantic network analysis with NodeXL see:

https://nodexl.codeplex.com/discussions/659371
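
As a rough illustration of the general idea behind a word association network (not necessarily the process the linked guide describes), the Python sketch below turns each post into pairs of words that appear together, so that words rather than people become the two ends of each edge. The function name word_pair_edges, the minimum word length and the sample posts are assumptions made for this example.

```python
import re
from collections import Counter
from itertools import combinations


def word_pair_edges(texts, min_length=4):
    """Count how often each pair of words appears together in the same text."""
    edge_counts = Counter()
    for text in texts:
        # Unique words in this text, skipping very short ones as a crude
        # way to drop filler words; sorted so each pair has one fixed order.
        words = sorted({
            word for word in re.findall(r"[a-z]+", text.lower())
            if len(word) >= min_length
        })
        edge_counts.update(combinations(words, 2))
    return edge_counts


posts = ["Coffee and cake", "Coffee with cake and friends"]  # made-up posts
edges = word_pair_edges(posts)

# One line per edge: first word, second word, number of posts containing both.
for (word_a, word_b), weight in sorted(edges.items()):
    print(word_a, word_b, weight)
```

Each printed line could then serve as one weighted edge between two word vertices, in place of an edge between two user accounts.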
Marked as answer by MarcSmith on 11/22/2016 at 1:33 PM