This is my code for parsing "img src" image links out of poorly formed HTML generated by an RSS feed. I am aware that NSXMLParser only parses XML, but I am hoping it can stumble through the mess and find these minuscule image links in the messy HTML.
I'm trying to retrieve ONLY the FIRST image link, i.e. the value of the src attribute of the first img element in nsData that has one, and then save it to an NSString *img in another class. The img tags are not all the same; for instance, any given nsData will contain only one image, in a form like any one of these:
< img class="ms-rteStyle-photoCredit" src="www.imagelinkthatineed.com" stuff I don't need
< img alt="" src="www.imagelinkineedfortableimagecellpreview" stuff I don't need
< img class="ms-rteStyle-photoCredit" src="www.IneedThisLink.com" more stuff I don't need
The only class that seems to generate NSLog output is the first one.
How can I get the parser delegate methods to actually run?
And even if there is a way, is there a different, simpler approach you would recommend?
#import "HtmlParser.h"
#import "ArticleItem.h"
@implementation HtmlParser
@synthesize elementArray;
- (HtmlParser *) InitHtmlByString:(NSString *)string {
    // NSString *description = [NSString string];
    NSData *nsData = [[NSData alloc] initWithContentsOfFile:(NSString *)string];
    elementArray = [[NSMutableArray alloc] init];
    parser = [[NSXMLParser alloc] initWithData:nsData];
    parser.delegate = self;
    [parser parse];
    // If I NSLog(@"%@", nsData); in this method body, the output spits out the raw HTML.
    currentHTMLElement = [ArticleItem alloc];
    return self;
}

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qualifiedName attributes:(NSDictionary *)attributeDict
{
    if ([elementName isEqualToString:@"img src"]) {
        currentHTMLElement = [[ArticleItem alloc] init];
    }
    NSLog(@"%@ found a %@ element", self, elementName);
}

- (void) parser:(NSXMLParser *)parser foundCharacters:(NSString *)string
{
    if (!currentHTMLElement)
        currentHTMLElement = [[NSMutableString alloc] initWithString:string];
    NSLog(@"Processing Value: %@", currentHTMLElement);
}

- (void) parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName
{
    if ([elementName isEqualToString:@"img src"])
    {
        currentHTMLElement.img = elementName;
        [elementArray addObject:currentHTMLElement];
        currentHTMLElement = nil;
        currentNodeContent = nil;
    }
    else
    {
        if (currentHTMLElement != nil && elementName != nil && ([elementName isEqualToString:@"img src"]))
        {
            [currentHTMLElement setValue:currentHTMLElement forKey:elementName];
        }
    }
    currentHTMLElement = nil;
}
@end
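For comparison, one simpler route that avoids NSXMLParser entirely would be a regular expression over the HTML string. This is only a rough sketch, assuming the HTML is already in an NSString; FirstImageSrc is a hypothetical helper name, and the pattern only covers quoted src values like the three examples above:

```objective-c
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the code above): returns the src value of
// the first img tag found in the given HTML string, or nil if there is none.
static NSString *FirstImageSrc(NSString *html) {
    if (html == nil) {
        return nil;
    }
    // Tolerates whitespace after '<', other attributes before src, and either
    // quote style. Assumption: the src value is always quoted.
    NSString *pattern = @"<\\s*img\\b[^>]*?\\bsrc\\s*=\\s*[\"']([^\"']+)[\"']";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        return nil;
    }
    NSTextCheckingResult *match =
        [regex firstMatchInString:html
                          options:0
                            range:NSMakeRange(0, html.length)];
    if (match == nil) {
        return nil;
    }
    // Capture group 1 holds the text between the quotes.
    return [html substringWithRange:[match rangeAtIndex:1]];
}
```

With something like this, the result of FirstImageSrc(htmlString) could be assigned straight to the img property, and only the first match is ever looked at.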
Thank you for your thoughts.