Python Web Scraping: Crawling Manmanbuy (慢慢买) Historical Low Prices, an Introduction to JS Reverse Engineering
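[This article is for learning purposes only. Do not use it for anything illegal; the author accepts no responsibility for unlawful use.]

Preface: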
I've recently been planning to move from Java to web scraping. With nothing much to do over the holiday I went looking for a project, and wondered whether I could scrape the product price history from Manmanbuy. (PS: I like using Manmanbuy to check price history when I shop, though I normally use the app :-)

Main text:

First, open the Manmanbuy site in Chrome, press F12, click any product, and look at the request:

Hmm, the parameters are pretty simple. The only one I didn't recognize was cxid, so I looked at the page URL: http://cu.manmanbuy.com/discuxiao_4076382.aspx.

Hmm, so that's all the parameters accounted for? I opened PyCharm right away and wrote the following code:

    import requests

    # cxid is the number taken from the product page URL (discuxiao_4076382.aspx)
    resp = requests.get('http://tool.manmanbuy.com/history.aspx?DA=1&action=gethistory&url=&bjid=&spbh=&cxid=4076382&zkid=&w=310&token=')
    print(resp.text)

Output:

    {"datePrice":"[1602518400000,20.90,\"\"],[1602604800000,20.90,\"\"],[1602691200000,20.90,\"\"], ... [1603814400000,20.90,\"\"],[1603900800000,17.9000,\"\"],[1603987200000,17.90,\"\"], ... [1606492800000,17.90,\"\"],[1606579200000,10.90,\"购买1件,当前价:17.90,优惠券:满17减7\"], ... [1606838400000,15.36,\"购买5件,当前价:16.80,满减:满3件,打9.5折,优惠券:满69减3\"], ... [1607529600000,16.8,\"京东秒杀价:16.8\"], ... [1608048000000,16.40,\"购买4件,当前价:17.90,满减:满49减3,优惠券:满69减3\"], ... [1608912000000,15.60,\"购买5件,plus价格16.8,满减:满49减3,优惠券:满69减3\"], ... [1609430400000,17.90,\"\"], ... [1609862400000,16.40,\"购买4件,当前价:17.90,满减:满49减3,优惠券:满69减3\"], ...

The response is too long, so most of it is omitted here. Comparing it with what the page shows, each list in the returned datePrice field holds three items: the first is the date as a millisecond timestamp, the second is the price, and the third is the promotion info (for example, "购买1件,当前价:17.90,优惠券:满17减7" means buy 1 item, current price 17.90, coupon of 7 off orders over 17).

Was that really all it took? To check, I searched for a few more products and tried them:

So there are pages like this too, and they call a different API? What? Fine, so I took another look at the parameters.

First, in the F12 panel on the left, press CTRL+F to open the search box and search for token (it is a token parameter, after all, so the JS must contain that string somewhere).

At this point the whole token-generation flow was worked out. Excited, I quickly wrote some code to test it:

    import requests

    # parse_req was defined earlier in the article, so it is not repeated here
    req = parse_req({'key': 'https://detail.tmall.com/item.htm?id=638265162028', 'method': 'getHistoryTrend'})
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36',
        'Accept-Language': 'zh-CN,zh;q=0.9',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'Cookie': 'ASP.NET_SessionId=k55tyozvlmbcmbiwm1yn0ng1; Hm_lvt_85f48cee3e51cd48eaba80781b243db3=1641120605; Hm_lvt_01a310dc95b71311522403c3237671ae=1641120605; _ga=GA1.2.1386971910.1641120645; _gid=GA1.2.1891960103.1641120645; Hm_lpvt_85f48cee3e51cd48eaba80781b243db3=1641186149; Hm_lpvt_01a310dc95b71311522403c3237671ae=1641186149',
        'sec-ch-ua': '" Not A;Brand";v="99", "Chromium";v="96", "Google Chrome";v="96"',
        'sec-ch-ua-mobile': '?0',
        'sec-ch-ua-platform': 'Windows',
        'Sec-Fetch-Mode': 'cors',
        'Sec-Fetch-Dest': 'empty',
        'Sec-Fetch-Site': 'same-origin',
        'Accept-Encoding': 'gzip, deflate, br',
        'Sec-Fetch-User': '?1',
        'Connection': 'keep-alive',
        'Host': 'tool.manmanbuy.com',
        'Origin': 'https://tool.manmanbuy.com',
        'Referer': 'https://tool.manmanbuy.com/HistoryLowest.aspx?url=https%3a%2f%2fitem.jd.com%2f100011493273.html',
    }
    print(headers)
    print(req)
    # POST the signed form data to the history API
    resp = requests.post('http://tool.manmanbuy.com/api.ashx', data=req, headers=headers)
    print(resp.status_code)
    print(resp.text)

To my surprise, this is what came back ("无效的票据" means "invalid ticket"):

    {"msg":"无效的票据","code":1,"data":null,"count":0}

WHAT? Is there another hidden parameter? I went through the request headers again and found that there is also an Authorization parameter.

It comes from a local function, and jumping to it leads to a "sum function" (I habitually call any function of the form function(a,b) return a+b; a sum function; in practice it may well be string concatenation, and the same applied to the token above). Continuing on:

With that, the API call finally returns a normal result:
    {"msg":"","code":0,"data":{"haveTrend":1,"changPriceRemark":"降幅4%","runtime":59,"zouShi_test":2,"changePriceCount":32,"spbh":"10|638265162028","spUrl":"https://item.taobao.com/item.htm?id=638265162028","spPic":"https://img.alicdn.com/bao/uploaded/i3/260808543/O1CN01wH0pMP2CykFkX6kO5_!!0-item_pic.jpg","currentPrice":133.0,"spName":"口红礼盒套装大牌正品全套盒化妆国风雕花小众彩妆生日礼物送女友","lowerDate":"2021-06-06T00:00:00","lowerPrice":116.33,"bjid":464502722,"zouShi":1,"siteId":10,"siteName":"天猫商城","datePrice":"[1613664000000,188.00,\"\"],[1613750400000,188.00,\"\"], ... [1616342400000,188.00,\"\"],[1616428800000,138.00,\"\"], ... [1616688000000,138.00,\"\"],[1616774400000,158.00,\"\"], ... [1619107200000,158.00,\"\"],[1619193600000,138.0000,\"\"], ... [1621267200000,138.00,\"\"],[1621353600000,133.0,\"购买1件,当前价:138.0,优惠券:满99减5\"], ... [1622563200000,133.0,\"购买1件,当前价:138.0,优惠券:满99减5\"],[1622649600000,138.0,\"\"],[1622736000000,116.33,\"购买3件,当前价:138.0,可叠加满减:每满200减30,优惠券:满99减5\"],[1622822400000,116.33,\"购买3件,当前价:138.0,可叠加满减:每满200减30,优惠券:满99减5\"],[1622908800000,116.33,\"购买3件,当前价:138.0,可叠加满减:每满200减30,优惠券:满99减5\"],[1622995200000,133.0,\"购买1件,当前价:138.0,优惠券:满99减5\"],[1623081600000,133.0,\"购买1件,当前价:138.0,优惠券:满99减5\"],[1623168000000,133.0,\"购买1件,当前价:138.0,优惠券:满99减5\"]

Again, only part of the response is copied here for convenience. If you want to see the full result, go ahead and try it yourself.

That's it for this post. If you liked it, please give it a like and a follow. Thanks, everyone.

One last reminder: all of this is written purely for learning and sharing JS reverse engineering. Please don't use it for anything improper.
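As a follow-up to the datePrice format described earlier (timestamp, price, promotion info), here is a minimal sketch of how that field could be turned into something usable in Python. It is not part of the original walkthrough: the function names parse_date_price and lowest_price are made up for illustration, and it assumes the timestamps are milliseconds since the epoch that line up with midnight in UTC+8. The sample string is just a few entries copied from the second response above, so nothing here touches the network.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: timestamps are epoch milliseconds and each one marks midnight in UTC+8.
CST = timezone(timedelta(hours=8))


def parse_date_price(raw):
    """Turn the raw datePrice string into (date, price, promo) tuples."""
    # The field is a comma-separated run of JSON arrays with no outer brackets,
    # so wrapping it in [] (after dropping any trailing comma) gives valid JSON.
    rows = json.loads("[" + raw.strip().rstrip(",") + "]")
    return [
        (datetime.fromtimestamp(ms / 1000, tz=CST).date(), float(price), promo)
        for ms, price, promo in rows
    ]


def lowest_price(points):
    """Return the most recent entry that has the minimum price."""
    best = min(price for _, price, _ in points)
    return [p for p in points if p[1] == best][-1]


# A few entries copied from the second response in the article.
sample = (
    '[1622649600000,138.0,""],'
    '[1622736000000,116.33,"购买3件,当前价:138.0,可叠加满减:每满200减30,优惠券:满99减5"],'
    '[1622822400000,116.33,"购买3件,当前价:138.0,可叠加满减:每满200减30,优惠券:满99减5"],'
    '[1622908800000,116.33,"购买3件,当前价:138.0,可叠加满减:每满200减30,优惠券:满99减5"],'
    '[1622995200000,133.0,"购买1件,当前价:138.0,优惠券:满99减5"]'
)

points = parse_date_price(sample)
day, price, promo = lowest_price(points)
print(day, price, promo)
```

On these entries the sketch picks 2021-06-06 at 116.33, which agrees with the lowerDate and lowerPrice fields in the same response. With a live response you would pass in the datePrice string taken from resp.json() instead of the hard-coded sample.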