Fala galera!
Tudo bem com vocês ?

Neste post eu gostaria de compartilhar uma solução que precisei desenvolver na minha empresa para realizar um web scraping e extrair dados das agências do site dos Correios (http://www2.correios.com.br/sistemas/agencias/). Como vocês devem imaginar, para atingir essa solução e trazer os dados para dentro do meu banco de dados rapidamente, utilizei o meu velho companheiro SQLCLR para realizar essa tarefa não tão simples, já que o site dos correios continha alguns erros no XHTML do site e isso acabou complicando um pouco a minha vida para tratar os dados.

Para conseguir tratar esses dados, vou utilizar técnicas de expressões regulares (Regexp ou Regex), as quais demonstrei seu uso no meu último post SQL Server – Como utilizar expressões regulares (RegExp) no seu banco de dados, portanto, recomendo a leitura desse artigo de Regexp antes desse artigo aqui.

Informação no site de origem:

Informação retornada pela Stored Procedure utilizando Web Scraping:

Como funciona a solução na teoria

Vou tentar resumir os passos para realizar o web scraping e explicar como fiz para conseguir atingir esse resultado.

  • O primeiro passo é identificar a URL utilizada para retornar os dados via Ajax no site

  • Agora que identifiquei a URL, o próximo passo é garantir que o nome da cidade não tenha acentos (Ex: Vitória). Para isso, vou utilizar a função fncRemove_Acentuacao (vou liberar os códigos mais abaixo).
  • Outro ponto importante, é codificar a URL para transformar espaços e outros caracteres como entidades HTML. (Ex: Espaço = %20). Para isso, vou utilizar a função fncCodifica_URL
  • Agora já posso montar a URL completa para enviar a requisição HttpRequest. Para isso, vou utilizar a Stored Procedure stpWs_Requisicao, que realiza a requisição e retorna o resultado em uma string de output
  • O retorno dessa requisição é o código HTML retornado pelo site, que é nesse formato:

  • Como vocês podem reparar, não é uma tarefa trivial tratar esse XHTML, ainda mais que ele contém alguns elementos que são abertos e não são fechados, gerando erro na hora de tentar converter para XML e tratar pelo SQL Server. Diante disso, vou tratar o retorno utilizando outra técnica: Expressão regular (RegExp)
  • Utilizo a máscara “<td .*?>(.*?)</td>” na função fncRegex_Match para retornar todos os valores que estão dentro das tags td. Aliado a isso, utilizo também as funções fncRemove_Html_String para transformar uma string HTML para texto e também a função fncRecupera_Letras_Sem_Caractere_Especial para remover quaisquer caracteres especiais na string.
  • A partir daí, vou aplicar alguns filtros para separar e categorizar os resultados desses tratamentos e assim conseguir classificar corretamente as informações e retornar esses dados no formato de tabela

Código-fonte da solução

Agora que expliquei na teoria como funciona a solução, chegou a hora de aplicá-la e demonstrá-la na prática.

Código-fonte do assembly SQLCLR, que será utilizado nesse artigo (T-SQL):

USE [dirceuresende]
GO

ALTER DATABASE [dirceuresende] SET TRUSTWORTHY ON
GO


IF (OBJECT_ID('dbo.fncCodifica_URL') IS NOT NULL) DROP FUNCTION [dbo].[fncCodifica_URL]
GO

IF (OBJECT_ID('dbo.fncRecupera_Letras_Sem_Caractere_Especial') IS NOT NULL) DROP FUNCTION [dbo].[fncRecupera_Letras_Sem_Caractere_Especial]
GO

IF (OBJECT_ID('dbo.fncRegex_Match') IS NOT NULL) DROP FUNCTION [dbo].[fncRegex_Match]
GO

IF (OBJECT_ID('dbo.fncRemove_Acentuacao') IS NOT NULL) DROP FUNCTION [dbo].[fncRemove_Acentuacao]
GO

IF (OBJECT_ID('dbo.fncRemove_Html_String') IS NOT NULL) DROP FUNCTION [dbo].[fncRemove_Html_String]
GO

IF (OBJECT_ID('dbo.stpWs_Requisicao') IS NOT NULL) DROP PROCEDURE [dbo].[stpWs_Requisicao]
GO

IF (EXISTS(SELECT NULL FROM sys.assemblies WHERE name = 'Webscraping_Correios')) DROP ASSEMBLY [Webscraping_Correios]
GO

CREATE ASSEMBLY [Webscraping_Correios]
    AUTHORIZATION [dbo]
    FROM 0x4D5A90000300000004000000FFFF0000B800000000000000400000000000000000000000000000000000000000000000000000000000000000000000800000000E1FBA0E00B409CD21B8014CCD21546869732070726F6772616D2063616E6E6F742062652072756E20696E20444F53206D6F64652E0D0D0A2400000000000000504500004C010300B1B6045B0000000000000000E00022200B0130000018000000060000000000000A370000002000000040000000000010002000000002000004000000000000000600000000000000008000000002000000000000030060850000100000100000000010000010000000000000100000000000000000000000B83600004F00000000400000D802000000000000000000000000000000000000006000000C000000803500001C0000000000000000000000000000000000000000000000000000000000000000000000000000000000000000200000080000000000000000000000082000004800000000000000000000002E7465787400000010170000002000000018000000020000000000000000000000000000200000602E72737263000000D80200000040000000040000001A0000000000000000000000000000400000402E72656C6F6300000C0000000060000000020000001E00000000000000000000000000004000004200000000000000000000000000000000EC360000000000004800000002000500B4270000CC0D000001000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001B3004006F01000001000011000F02280700000A2D090F02280800000A2B0572010000700A0F01280700000A2D090F01280800000A2B0572030000700B0F00280800000A0C7E0900000A0D0F03280700000A2D090F03280800000A2B05720B00007013041104280A00000A130611062C07720B000070130408280B00000A740800000113051105076F0C00000A00077217000070280D00000A2C0B066F0E00000A16FE022B0116130711072C5100061308280F00000A11086F1000000A1309110511098E696A6F1100000A0011056F1200000A130A00110A11091611098E696F1300000A00110A6F1400000A0000DE0D110A2C08110A6F1500000A00DC0011056F1600000A740A000001130B00110B6F1700000A130C00110C14FE03130D110D2C2B00110C1104281800000A731900000A130E00110E6F1A00000A0D00DE0D110E2C08110E6F1500000A00DC0000DE0D110C2C08110C6F1500000A00DC00DE0D110B2C08110B6F1500000A00DC0E0409281B00000A81070000012A00013400000200C9001BE4000D00000000020027010C33010D0000000002000A013A44010D00000000020000015454010D000000002202281C00000A002A000000133003004500000002000011000F00280700000A2D310F00280800000A722100007072270000706F1D00000A7237000070723B0000706F1D00000A281E00000A281B00000A2B057E1F00000A0A2B00062A000000133002008200000003000011000F00280700000A0C082C0914281B00000A0D2B6B0F00280800000A186F2000000A0A732100000A0B000613041613052B35110411056F2200000A1306001106282300000A130711071BFE0116FE01130811082C0B000711066F2400000A260000110517581305110511046F0E00000A32C0076F2500000A281B00000A0D2B00092A00001B3002006A00000004000011000F00280700000A2D090F01280700000A2B01170A062C087E1F00000A0B2B4800000F00280800000A0F01280800000A282600000A0C086F2700000A2D077E1F00000A2B16086F2800000A166F2900000A6F2A00000A281B00000A0BDE0A0D007E1F00000A0BDE00072A000001100000000021003D5E000A10000001133004007E000000050000110002280A00000A0B072C04140C2B6D020A06282B00000A0A030D092C380006723F000070282C00000A17282D00000A0A06724B000070282C00000A17282D00000A0A067259000070282C00000A17282D00000A0A00067263000070727100007017282D00000A0A0672750000707E0900000A282E00000A0A060C2B00082A00001B3003006A03000006000011000F00280700000A130511052C0C7E1F00000A1306384D0300000F01282F00000A130711072C0C7E1F00000A130638340300000F03282F00000A2D090F03283000000A2B01160A0F02280700000A2D090F02280800000A2B0572010000700B0F01283000000A0C062D140F00280800000A186F2000000A6F3100000A2B0C0F00280800000A6F3100000A0D732100000A130400097E02000004252D17267E01000004FE060B000006733200000A258002000004280100002B6F3400000A1308387802000011086F3500000A13090008130A110A39550100000006130B110B39EB0000000011091F41320911091F5A3EAD00000011091F61320911091F7A3E9E00000011091F30320911091F393E8F00000011091F203B86000000110920C00000003209110920CF0000003174110920D10000003209110920D60000003162110920D90000003209110920DD0000003150110920E00000003209110920E4000000313E110920E70000003209110920EF000000312C110920F10000003209110920F6000000311A110920F9000000320E110920FD000000FE0216FE012B01162B0117130C110C2C0C110411096F2400000A262B18076F0E00000A16FE02130D110D2C091104076F3600000A26002B590011091F41320611091F5A312011091F61320611091F7A311411091F30320611091F39310811091F20FE012B0117130E110E2C0C110411096F2400000A262B18076F0E00000A16FE02130F110F2C091104076F3600000A260000380E01000000061310111039C40000000011091F20320911091F7E3E86000000110920C00000003209110920CF0000003174110920D10000003209110920D60000003162110920D90000003209110920DD0000003150110920E00000003209110920E4000000313E110920E70000003209110920EF000000312C110920F10000003209110920F6000000311A110920F9000000320E110920FD000000FE0216FE012B01162B0117131111112C0C110411096F2400000A262B18076F0E00000A16FE02131211122C091104076F3600000A26002B3E0011091F20320B11091F7EFE0216FE012B0116131311132C0C110411096F2400000A262B18076F0E00000A16FE02131411142C091104076F3600000A2600000011086F3700000A3A7CFDFFFFDE0D11082C0811086F1500000A00DC11046F2500000A281B00000A13062B0011062A0000411C000002000000BF0000008B0200004A0300000D000000000000002E730A00000680010000042A3603282300000A1BFE0116FE012A000042534A4201000100000000000C00000076342E302E33303331390000000005006C0000008C040000237E0000F80400005406000023537472696E6773000000004C0B00008400000023555300D00B0000100000002347554944000000E00B0000EC01000023426C6F62000000000000000200000157150208090A000000FA013300160000010000002700000004000000020000000B00000010000000370000000A0000000600000003000000010000000400000001000000010000000000F40201000000000006000302A50406002302A5040600C40192040F00C50400000600AD054B030A00D8013E040A008702D4040E00E305B40506003B039B000E007201B405060018049B000A00EE013E0406003004FB0506001C066F030E00A3020305060094034B030A006403D40406004A00EB000600A901A504060058004B03060093024B030E00E705B40506005802FB05060040014B030E007601B405060025049B000E00C6024B0306005203FB050600ED036F030E00070603050E00060403050E00840303050E006A0103050E003806B4050600CB054B030E004A050305120035010C0406003C00EB0006006B042205000000006C00000000000100010001001000E9040000150001000100010010003505000015000100030003211000E7000000150001000900360068006E011600010072015020000000009600C2037A01010000220000000086188504060006000C220000000096008400890106006022000000009600AD0389010700F0220000000096009A02900108007823000000009600610299010A000424000000009600CA029F010C00002200000000861885040600100098270000000091188B04AC0110000022000000008618850406001000A4270000000083000A00B001100000000100180300000200E303000003006C05000004009E0302000500A50000000100940000000100FD0300000100FD0300000200C70000000100750000000200B70000000100FD0300000200D303000003005904000004007A0500000100A602090085040100110085040600190085040A0031008504060061008504060099008504060039000D032800390041022C00A9004B063000A90043063300B1009C013800B10017013E00A9002C064300A900A9024900B9005F004D00B900FA045200B100B4025800B10031035D004900A301620049008E010600C10094010600B10082016A00C9001F035D00B90055026F00590085047500D10006012C003900BF057D00290085040600A90022018800D90077028E00390013039300A9004B02A600690085040600A9009705AC00E9001906B10069001001B700290091022C00F100A302C700F900A105280079008C05CE0001014203D400090141022C0011012A018E0019014C01E100F1002201E500F1002201EF0089000D032800890041022800A9000D0615010C0085042101290158012701140077044B011C00D7055A01690010015F013901F205280020002300E6012E000B00B5012E001300BE012E001B00DD0160002B00E60180002B00E60183003300E601A0002B00E601C0002B00E601E0002B00E601100083009700BD00DA00F6001A0144015401048000000000000000000000000000000000570500000400000000000000000000006501DE00000000000400000000000000000000006501D2000000000004000000000000000000000065014B030000000004000000000000000000000065015E0100000000040003006700400100000000003C3E395F5F345F30003C666E6352656375706572615F4C65747261735F53656D5F4361726163746572655F457370656369616C3E625F5F345F300049456E756D657261626C6560310049456E756D657261746F7260310046756E636032006765745F55544638003C3E39003C4D6F64756C653E0044735F537472696E675F48544D4C00666E63436F6469666963615F55524C0044735F55524C0053797374656D2E494F0044735F5265746F726E6F5F4F555450555400466C5F5175656272615F4C696E68610044735F4D6173636172610053797374656D2E44617461006D73636F726C6962003C3E630053797374656D2E436F6C6C656374696F6E732E47656E657269630052656164546F456E6400417070656E64007365745F4D6574686F64005265706C6163650048746D6C4465636F646500456E756D657261626C650049446973706F7361626C65006765745F4E65774C696E650057686572650053797374656D2E436F726500436170747572650048747470576562526573706F6E736500476574526573706F6E736500436C6F736500446973706F73650043726561746500577269746500436F6D70696C657247656E6572617465644174747269627574650044656275676761626C654174747269627574650053716C50726F6365647572654174747269627574650053716C46756E6374696F6E41747472696275746500436F6D70696C6174696F6E52656C61786174696F6E734174747269627574650052756E74696D65436F6D7061746962696C697479417474726962757465006765745F56616C7565004E6F726D616C697A6500476574456E636F64696E6700666E6352656D6F76655F48746D6C5F537472696E6700457363617065557269537472696E670053716C537472696E6700546F537472696E6700666E6352656765785F4D61746368006765745F4C656E677468007365745F436F6E74656E744C656E6774680055726900666E6352656375706572615F4C65747261735F53656D5F4361726163746572655F457370656369616C005765627363726170696E675F436F727265696F732E646C6C006765745F49734E756C6C0044735F55726C00476574526573706F6E736553747265616D004765745265717565737453747265616D006765745F4974656D0053797374656D004E6F726D616C697A6174696F6E466F726D0053716C426F6F6C65616E0053797374656D2E476C6F62616C697A6174696F6E0047726F7570436F6C6C656374696F6E00457863657074696F6E0044735F436F64696669636163616F00666E6352656D6F76655F4163656E74756163616F0073747057735F5265717569736963616F00466C5F416C66616E756D657269636F0044735F4D65746F646F0043686172556E69636F6465496E666F0044735F546578746F0047726F75700053797374656D2E4C696E710053747265616D526561646572005465787452656164657200537472696E674275696C646572004D6963726F736F66742E53716C5365727665722E5365727665720044735F537562737469747569725F506F720049456E756D657261746F7200476574456E756D657261746F72002E63746F72002E6363746F720053797374656D2E446961676E6F73746963730053797374656D2E52756E74696D652E436F6D70696C6572536572766963657300446562756767696E674D6F6465730053797374656D2E446174612E53716C54797065730053746F72656450726F636564757265730047657442797465730053797374656D2E546578742E526567756C617245787072657373696F6E730053797374656D2E436F6C6C656374696F6E730055736572446566696E656446756E6374696F6E730052656765784F7074696F6E73005765627363726170696E675F436F727265696F730044735F506172616D6574726F7300466C5F4163656974615F4163656E746F73006765745F47726F757073006765745F4368617273006765745F53756363657373004F626A6563740053797374656D2E4E6574006F705F496D706C6963697400456E7669726F6E6D656E74006765745F43757272656E74004874747057656252657175657374004D6F76654E6578740053797374656D2E5465787400526567657800546F43686172417272617900476574556E69636F646543617465676F7279006F705F457175616C697479005765625574696C6974790049734E756C6C4F72456D7074790000000000010007470045005400000B5500540046002D003800010950004F005300540000055C005C00000F68007400740070003A002F002F0000035C0000032F00000B3C00620072002F003E00000D3C006200720020002F003E0000093C00620072003E00000D26006E006200730070003B0000032000000B3C002E002A003F003E0000000000E8F81B1ABDCAAF498C6FFAE0314C37CE0004200101080320000105200101111117070F0E0E0E0E0E122102020E1D0512251229122502122D032000020320000E02060E040001020E05000112590E042001010E050002020E0E03200008040000125D0520011D050E042001010A0420001225072003011D0508080420001265050001125D0E072002011225125D050001111D0E040701111D0520020E0E0E0400010E0E0306111D0E07090E123502111D0E08031139020520010E1171042001030805000111390305200112350309070402111D123D1241060002123D0E0E052000128081052001127D080607040E020E020300000E0900040E0E0E0E1180910600030E0E0E0E1E0715020E021D03123502111D0215124901030302020202020202020202020420001D0306151251020302052002011C181810010215128099011E0015128099011E00151251021E0002030A010306151280990103082000151249011300051512490103042000130005200112350E08B77A5C561934E0890306121007061512510203020E000501111D111D111D111D10111D060001111D111D080002111D111D111D0500020E0E020C0004111D111D1145111D11450300000104200102030801000800000000001E01000100540216577261704E6F6E457863657074696F6E5468726F77730108010007010000000004010000000000000000B1B6045B00000000020000001C0100009C3500009C17000052534453A5E0CF413234D0468D51838C5CDECBDE01000000633A5C75736572735C646966696C5C646F63756D656E74735C76697375616C2073747564696F20323031375C50726F6A656374735C5765627363726170696E675F436F727265696F735C5765627363726170696E675F436F727265696F735C6F626A5C44656275675C5765627363726170696E675F436F727265696F732E7064620000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000E03600000000000000000000FA360000002000000000000000000000000000000000000000000000EC360000000000000000000000005F436F72446C6C4D61696E006D73636F7265652E646C6C0000000000FF2500200010000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001001000000018000080000000000000000000000000000001000100000030000080000000000000000000000000000001000000000048000000584000007C02000000000000000000007C0234000000560053005F00560045005200530049004F004E005F0049004E0046004F0000000000BD04EFFE00000100000000000000000000000000000000003F000000000000000400000002000000000000000000000000000000440000000100560061007200460069006C00650049006E0066006F00000000002400040000005400720061006E0073006C006100740069006F006E00000000000000B004DC010000010053007400720069006E006700460069006C00650049006E0066006F000000B801000001003000300030003000300034006200300000002C0002000100460069006C0065004400650073006300720069007000740069006F006E000000000020000000300008000100460069006C006500560065007200730069006F006E000000000030002E0030002E0030002E003000000052001900010049006E007400650072006E0061006C004E0061006D00650000005700650062007300630072006100700069006E0067005F0043006F0072007200650069006F0073002E0064006C006C00000000002800020001004C006500670061006C0043006F0070007900720069006700680074000000200000005A00190001004F0072006900670069006E0061006C00460069006C0065006E0061006D00650000005700650062007300630072006100700069006E0067005F0043006F0072007200650069006F0073002E0064006C006C0000000000340008000100500072006F006400750063007400560065007200730069006F006E00000030002E0030002E0030002E003000000038000800010041007300730065006D0062006C0079002000560065007200730069006F006E00000030002E0030002E0030002E0030000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000003000000C0000000C3700000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
    WITH PERMISSION_SET = EXTERNAL_ACCESS;


GO

PRINT N'Creating [dbo].[fncCodifica_URL]...';
GO

CREATE FUNCTION [dbo].[fncCodifica_URL]
(@Ds_URL NVARCHAR (MAX) NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncCodifica_URL]


GO
PRINT N'Creating [dbo].[fncRecupera_Letras_Sem_Caractere_Especial]...';

GO


CREATE FUNCTION [dbo].[fncRecupera_Letras_Sem_Caractere_Especial]
(@Ds_Texto NVARCHAR (MAX) NULL, @Fl_Alfanumerico BIT NULL, @Ds_Substituir_Por NVARCHAR (MAX) NULL, @Fl_Aceita_Acentos BIT NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncRecupera_Letras_Sem_Caractere_Especial]


GO
PRINT N'Creating [dbo].[fncRegex_Match]...';
GO

CREATE FUNCTION [dbo].[fncRegex_Match]
(@Ds_Texto NVARCHAR (MAX) NULL, @Ds_Mascara NVARCHAR (MAX) NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncRegex_Match]


GO
PRINT N'Creating [dbo].[fncRemove_Acentuacao]...';


GO

CREATE FUNCTION [dbo].[fncRemove_Acentuacao]
(@Ds_Texto NVARCHAR (MAX) NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncRemove_Acentuacao]


GO
PRINT N'Creating [dbo].[fncRemove_Html_String]...';


GO

CREATE FUNCTION [dbo].[fncRemove_Html_String]
(@Ds_String_HTML NVARCHAR (MAX) NULL, @Fl_Quebra_Linha BIT NULL)
RETURNS NVARCHAR (MAX)
AS
 EXTERNAL NAME [Webscraping_Correios].[UserDefinedFunctions].[fncRemove_Html_String]


GO
PRINT N'Creating [dbo].[stpWs_Requisicao]...';
GO


CREATE PROCEDURE [dbo].[stpWs_Requisicao]
@Ds_Url NVARCHAR (MAX) NULL, @Ds_Metodo NVARCHAR (MAX) NULL, @Ds_Parametros NVARCHAR (MAX) NULL, @Ds_Codificacao NVARCHAR (MAX) NULL, @Ds_Retorno_OUTPUT NVARCHAR (MAX) NULL OUTPUT
AS EXTERNAL NAME [Webscraping_Correios].[StoredProcedures].[stpWs_Requisicao]
GO

PRINT N'Update complete.';

Código-fonte da stpBusca_Agencias_Correios (T-SQL):
Visualizar código-fonte

USE [dirceuresende]
GO

IF (OBJECT_ID('dbo.stpBusca_Agencias_Correios') IS NULL) EXEC('CREATE PROCEDURE dbo.stpBusca_Agencias_Correios AS SELECT 1')
GO

ALTER PROCEDURE dbo.stpBusca_Agencias_Correios (
    @Ds_Cidade VARCHAR(50) = 'Vitória',
    @Sg_UF CHAR(2) = 'ES'
)
AS BEGIN


    -- DECLARE @Ds_Cidade VARCHAR(50) = 'Vitória', @Sg_UF CHAR(2) = 'ES'

    DECLARE 
        @Ds_Retorno_OUTPUT NVARCHAR(MAX),
        @Url VARCHAR(MAX)


    SET @Sg_UF = UPPER(@Sg_UF)
    SET @Ds_Cidade = dbo.fncCodifica_URL(UPPER(dbo.fncRemove_Acentuacao(@Ds_Cidade)))

    
    SET @Url = 'http://www2.correios.com.br/sistemas/agencias/inc/divDetalheAgencia.cfm?uf=' + @Sg_UF + '&municipio=' + @Ds_Cidade + '%20%20%20%20%20%20%20&bairro=&horario=&atendimento=&_cf_containerId=divDetalheAgencia&_cf_nodebug=true&_cf_nocache=true&_cf_clientid=D5982645C0BADD54FDD972B4246C6C48&_cf_rc=30'

    
    EXEC dbo.stpWs_Requisicao
        @Ds_Url = @Url,                                 -- nvarchar(max)
        @Ds_Metodo = N'GET',                              -- nvarchar(max)
        @Ds_Parametros = N'',                          -- nvarchar(max)
        @Ds_Codificacao = N'iso-8859-1',                         -- nvarchar(max)
        @Ds_Retorno_OUTPUT = @Ds_Retorno_OUTPUT OUTPUT -- nvarchar(max)

    
    DECLARE @Contador INT = 1, @Total INT = LEN(@Ds_Retorno_OUTPUT), @Resultado VARCHAR(MAX) = '', @String VARCHAR(MAX), @contaLinhas INT = 0, @StringSemHTML VARCHAR(MAX)
    DECLARE @Retorno TABLE ( id INT IDENTITY(1,1) NOT NULL, texto VARCHAR(MAX) NOT NULL, ranking INT NOT NULL, id_grupo INT NULL )

    WHILE(@Resultado IS NOT NULL)
    BEGIN
    
        SET @String = SUBSTRING(@Ds_Retorno_OUTPUT, @Contador, @Total)
        SET @Resultado = dbo.fncRegex_Match(@String, '<td .*?>(.*?)</td>')
        SET @StringSemHTML = LTRIM(RTRIM(dbo.fncRemove_Html_String(@Resultado, 1)))
        SET @StringSemHTML = NULLIF(LTRIM(RTRIM(dbo.fncRecupera_Letras_Sem_Caractere_Especial(@StringSemHTML, 1, '', 1) COLLATE SQL_Latin1_General_CP1_CI_AI)), '')

        IF (LOWER(@Resultado) LIKE '%ncia%correio%' OR LOWER(@Resultado) LIKE '%ncia%filatelica%' OR LOWER(@Resultado) LIKE '%agf %')
            SET @contaLinhas = 1 
        ELSE 
            IF (@StringSemHTML IS NOT NULL AND @StringSemHTML <> 'LOCALIZAÇÃO')
                SET @contaLinhas += 1


        INSERT INTO @Retorno(texto, ranking)
        SELECT @StringSemHTML, @contaLinhas
        WHERE @StringSemHTML IS NOT NULL
        AND @StringSemHTML NOT IN ('Sábado', 'Domingo', 'Horário de almoço', 'Plantão no Sábado', 'Plantão no Domingo', 'HORÁRIO DE ATENDIMENTO', 'LOCALIZAÇÃO')
        AND @StringSemHTML NOT LIKE 'Segunda a sexta%'
        AND @StringSemHTML NOT LIKE 'Sábado%'


        SET @Contador = CHARINDEX(@Resultado, @Ds_Retorno_OUTPUT, @Contador) + LEN(@Resultado)


    END


    DECLARE @id_grupo INT = 0

    SET @Contador = 1
    SET @Total = (SELECT COUNT(*) FROM @Retorno)

    WHILE(@Contador <= @Total)
    BEGIN
        

        IF ((SELECT TOP(1) ranking FROM @Retorno WHERE id = @Contador ORDER BY id) = 1)
            SET @id_grupo += 1


        UPDATE @Retorno
        SET id_grupo = @id_grupo
        WHERE id = @Contador


        SET @Contador += 1

    END

    
    DECLARE @Tabela_Final TABLE (
        Ds_Agencia VARCHAR(50) NOT NULL,
        Ds_Tipo_Agencia VARCHAR(50) NOT NULL,
        Ds_Endereco VARCHAR(100) NOT NULL,
        Ds_Complemento VARCHAR(100) NULL,
        Ds_Bairro VARCHAR(100) NOT NULL,
        Ds_Municipio VARCHAR(100) NOT NULL,
        Ds_Cidade VARCHAR(100) NOT NULL,
        Sg_UF VARCHAR(2) NOT NULL,
        Ds_CEP VARCHAR(10) NOT NULL
    )


    SET @Contador = 1
    SET @Total = (SELECT COUNT(DISTINCT id_grupo) FROM @Retorno)

    WHILE(@Contador <= @Total)
    BEGIN
        

        INSERT INTO @Tabela_Final
        SELECT 
            MAX(CASE WHEN ranking = 1 THEN LTRIM(RTRIM(LEFT(texto, 32))) ELSE NULL END) AS Ds_Agencia,
            MAX(CASE WHEN ranking = 1 THEN LTRIM(RTRIM(SUBSTRING(texto, 32, 9999))) ELSE NULL END) AS Ds_Tipo_Agencia,
            MAX(CASE WHEN ranking = 2 THEN LTRIM(RTRIM(REPLACE(texto, 'Endereço ', ''))) ELSE NULL END) AS Ds_Endereco,
            MAX(CASE WHEN ranking = 3 THEN NULLIF(LTRIM(RTRIM(REPLACE(texto, 'Complemento', ''))), '') ELSE NULL END) AS Ds_Complemento,
            MAX(CASE WHEN ranking = 4 THEN LTRIM(RTRIM(REPLACE(texto, 'Bairro ', ''))) ELSE NULL END) AS Ds_Bairro,
            MAX(CASE WHEN ranking = 5 THEN LTRIM(RTRIM(REPLACE(texto, 'Município ', ''))) ELSE NULL END) AS Ds_Municipio,
            MAX(CASE WHEN ranking = 6 THEN LTRIM(RTRIM(REPLACE(texto, 'Cidade ', ''))) ELSE NULL END) AS Ds_Cidade,
            MAX(CASE WHEN ranking = 7 THEN LTRIM(RTRIM(REPLACE(texto, 'UF ', ''))) ELSE NULL END) AS Sg_UF,
            MAX(CASE WHEN ranking = 8 THEN LTRIM(RTRIM(REPLACE(texto, 'CEP ', ''))) ELSE NULL END) AS Ds_CEP
        FROM 
            @Retorno
        WHERE
            id_grupo = @Contador


        SET @Contador += 1


    END


    SELECT * FROM @Tabela_Final
   

END


/*

EXEC dbo.stpBusca_Agencias_Correios
    @Ds_Cidade = 'Vitória', -- varchar(50)
    @Sg_UF = 'ES'      -- char(2)

*/

Código-fonte dos objetos SQLCLR

Acima, eu disponibilizei o fonte para a criação do assembly já pronto com as funções do SQLCLR e também os objetos de banco de dados (funções e Stored Procedure) para utilizar o assembly. Isso facilita muito a utilização desses objetos e não requer conhecimentos em C# ou ter o Visual Studio na sua máquina para compilar a DLL do SQLCLR.

Caso você seja um cara que goste de conhecer tudo a fundo e entender o código do SQLCLR, vou disponibilizar aqui também o código original desses objetos, para você incluir no seu projeto de SQLCLR, por exemplo.

Se você não conhece o SQLCLR, não pode deixar de ler os artigos abaixo:
Introdução ao SQL CLR (Common Language Runtime) no SQL Server
SQL Server – Comparação de performance entre Scalar Function e CLR Scalar Function
Outros posts sobre SQLCLR

Código-fonte da Procedure stpWs_Requisicao (C#):
Visualizar código-fonte

using System.Data.SqlTypes;
using System.Net;
using System.IO;
using System.Text;

public partial class StoredProcedures
{
    [Microsoft.SqlServer.Server.SqlProcedure]
    public static void stpWs_Requisicao(SqlString Ds_Url, SqlString Ds_Metodo, SqlString Ds_Parametros, SqlString Ds_Codificacao, out SqlString Ds_Retorno_OUTPUT)
    {

        var parametros = (Ds_Parametros.IsNull) ? "" : Ds_Parametros.Value;
        var metodo = (Ds_Metodo.IsNull) ? "GET" : Ds_Metodo.Value;
        var url = Ds_Url.Value;
        var feedData = string.Empty;
        var encoding = (Ds_Codificacao.IsNull) ? "UTF-8" : Ds_Codificacao.Value;

        if (string.IsNullOrEmpty(encoding))
            encoding = "UTF-8";

        var request = (HttpWebRequest) WebRequest.Create(url);
        request.Method = metodo;
        
        if (metodo == "POST" && parametros.Length > 0)
        {
            var data = parametros;
            var dataStream = Encoding.UTF8.GetBytes(data);

            request.ContentLength = dataStream.Length;

            using (var newStream = request.GetRequestStream())
            {
                newStream.Write(dataStream, 0, dataStream.Length);
                newStream.Close();
            }
        }


        using (var response = (HttpWebResponse) request.GetResponse())
        {
            using (var stream = response.GetResponseStream())
            {
                if (stream != null)
                {
                    using (var streamReader = new StreamReader(stream, Encoding.GetEncoding(encoding)))
                    {
                        feedData = streamReader.ReadToEnd();
                    }
                }
            }
        }

        Ds_Retorno_OUTPUT = feedData;

    }

};

Código-fonte da fncCodifica_URL (C#):
Visualizar código-fonte

using System;
using System.Data.SqlTypes;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString fncCodifica_URL(SqlString Ds_URL)
    {
        return Ds_URL.IsNull ? SqlString.Null : Uri.EscapeUriString(Ds_URL.Value.Replace("\\\\", "http://").Replace("\\", "/"));
    }
}

Código-fonte da fncRemove_Acentuacao (C#):
Visualizar código-fonte

using System.Text;
using System.Data.SqlTypes;
using System.Globalization;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString fncRemove_Acentuacao(SqlString Ds_Texto)
    {

        if (Ds_Texto.IsNull)
            return null;

        var s = Ds_Texto.Value.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder();

        foreach (var t in s)
        {
            var uc = CharUnicodeInfo.GetUnicodeCategory(t);
            if (uc != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(t);
            }
        }

        return sb.ToString();

    }
};

Código-fonte da fncRegex_Match (C#):
Visualizar código-fonte

using System;
using System.Data.SqlTypes;
using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString fncRegex_Match(SqlString Ds_Texto, SqlString Ds_Mascara)
    {

        if (Ds_Texto.IsNull || Ds_Mascara.IsNull)
            return SqlString.Null;

        try
        {
            var resultado = Regex.Match(Ds_Texto.Value, Ds_Mascara.Value);
            return resultado.Success ? resultado.Groups[0].Value : SqlString.Null;
        }
        catch (Exception e)
        {
            return SqlString.Null;
        }

    }
}

Código-fonte da fncRemove_Html_String (C#):
Visualizar código-fonte

using System;
using System.Net;
using System.Text.RegularExpressions;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static string fncRemove_Html_String(string Ds_String_HTML, bool Fl_Quebra_Linha)
    {

        if (string.IsNullOrEmpty(Ds_String_HTML))
            return null;


        var html = Ds_String_HTML;
        html = WebUtility.HtmlDecode(html);

        if (Fl_Quebra_Linha)
        {
            html = Regex.Replace(html, "<br/>", Environment.NewLine, RegexOptions.IgnoreCase);
            html = Regex.Replace(html, "<br />", Environment.NewLine, RegexOptions.IgnoreCase);
            html = Regex.Replace(html, "<br>", Environment.NewLine, RegexOptions.IgnoreCase);
        }

        html = Regex.Replace(html, " ", " ", RegexOptions.IgnoreCase);
        html = Regex.Replace(html, "<.*?>", string.Empty);

        return html;
    }
}

Código-fonte da fncRecupera_Letras_Sem_Caractere_Especial (C#):
Visualizar código-fonte

using System.Data.SqlTypes;
using System.Globalization;
using System.Linq;
using System.Text;

public partial class UserDefinedFunctions
{
    [Microsoft.SqlServer.Server.SqlFunction]
    public static SqlString fncRecupera_Letras_Sem_Caractere_Especial(SqlString Ds_Texto, SqlBoolean Fl_Alfanumerico, SqlString Ds_Substituir_Por, SqlBoolean Fl_Aceita_Acentos)
    {

        if (Ds_Texto.IsNull)
            return SqlString.Null;

        if (Fl_Alfanumerico.IsNull)
            return SqlString.Null;

        var aceitaAcentos = (!Fl_Aceita_Acentos.IsNull) && Fl_Aceita_Acentos.Value;
        var substituirPor = (Ds_Substituir_Por.IsNull) ? "" : Ds_Substituir_Por.Value;
        var somenteAlfanumerico = Fl_Alfanumerico.Value;
        var normalizedArray = aceitaAcentos ? Ds_Texto.Value.ToCharArray() : Ds_Texto.Value.Normalize(NormalizationForm.FormD).ToCharArray();
        var builder = new StringBuilder();

        foreach (var ch in normalizedArray.Where(ch => CharUnicodeInfo.GetUnicodeCategory(ch) != UnicodeCategory.NonSpacingMark))
        {

            if (somenteAlfanumerico)
            {

                if (aceitaAcentos)
                {

                    if ((ch >= 65 && ch <= 90) || (ch >= 97 && ch <= 122) || (ch >= 48 && ch <= 57) || ch == 32 || (ch >= 192 && ch <= 207) || (ch >= 209 && ch <= 214) || (ch >= 217 && ch <= 221) || (ch >= 224 && ch <= 228) || (ch >= 231 && ch <= 239) || (ch >= 241 && ch <= 246) || (ch >= 249 && ch <= 253))
                        builder.Append(ch);
                    else if (substituirPor.Length > 0)
                        builder.Append(substituirPor);

                }
                else
                {

                    if ((ch >= 65 && ch <= 90) || (ch >= 97 && ch <= 122) || (ch >= 48 && ch <= 57) || ch == 32)
                        builder.Append(ch);
                    else if (substituirPor.Length > 0)
                        builder.Append(substituirPor);

                }
            }
            else
            {

                if (aceitaAcentos)
                {

                    if ((ch >= 32 && ch <= 126) || (ch >= 192 && ch <= 207) || (ch >= 209 && ch <= 214) || (ch >= 217 && ch <= 221) || (ch >= 224 && ch <= 228) || (ch >= 231 && ch <= 239) || (ch >= 241 && ch <= 246) || (ch >= 249 && ch <= 253))
                        builder.Append(ch);
                    else if (substituirPor.Length > 0)
                        builder.Append(substituirPor);

                }
                else
                {

                    if (ch >= 32 && ch <= 126)
                        builder.Append(ch);
                    else if (substituirPor.Length > 0)
                        builder.Append(substituirPor);

                }
            }
        }
        
        return builder.ToString();

    }
}

É isso aí, pessoal!
Espero que tenham gostado desse post e até a próxima!

Abraço!