Designing dashboards for web analytics of an e-commerce site. Part 3: the SEO channel

In this article we will put together a dashboard for analyzing SEO traffic. We will pull the data with Python scripts and from .csv files.

What will we export?


To analyze the dynamics of search phrase positions, you need exports from Yandex.Webmaster and Google Search Console. To judge how "useful" it is to push a search phrase up the rankings, frequency data comes in handy; it can be obtained from Yandex.Direct and Google Ads. And to keep an eye on the technical side of the site, we will use PageSpeed Insights.


SEO traffic dynamics

Google Search Console


To interact with the API we will use the searchconsole library. Its GitHub page describes in detail how to obtain the tokens needed for authorization. The procedure for pulling the data and loading it into an MS SQL database is as follows:

import searchconsole
import pandas as pd
from datetime import datetime, timedelta

def google_reports():
    # authorize with the credentials obtained earlier
    account = searchconsole.authenticate(client_config=r'credentials.json', credentials=r'cred_result.json')
    webproperty = account['https://test.com/']
    # request clicks, impressions and positions by country, device, page and query
    report = webproperty.query.range('today', days=-10).dimension('country', 'device', 'page', 'query').get()
    now = datetime.now()
    fr = now - timedelta(days=9)
    to = now - timedelta(days=3)
    res = pd.DataFrame(columns=['dt_from', 'dt_to', 'country', 'device', 'page', 'query', 'clicks', 'impressions', 'position'])
    # collect the report rows into a DataFrame
    for i in report.rows:
        temp = {}
        temp['country'] = i[0]
        temp['device'] = i[1]
        temp['page'] = i[2]
        temp['query'] = i[3]
        temp['clicks'] = i[4]
        temp['impressions'] = i[5]
        temp['position'] = i[7]
        temp['dt_from'] = fr.strftime("%Y-%m-%d")
        temp['dt_to'] = to.strftime("%Y-%m-%d")
        res = res.append(temp, ignore_index=True)
    to_sql_server(res, 'google_positions')
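Every snippet in this article hands its DataFrame to a to_sql_server helper that is not shown in this part. A minimal sketch of what such a helper could look like, assuming pymssql and connection settings (host, user, password, database) defined elsewhere:

import pymssql

def to_sql_server(df, table):
    # hypothetical reconstruction of the helper used throughout this article
    conn = pymssql.connect(host=host, user=user, password=password, database=database)
    cursor = conn.cursor()
    cols = ', '.join(df.columns)
    placeholders = ', '.join(['%s'] * len(df.columns))
    cursor.executemany(
        'INSERT INTO {0} ({1}) VALUES ({2})'.format(table, cols, placeholders),
        [tuple(row) for row in df.itertuples(index=False)]
    )
    conn.commit()
    conn.close()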

Yandex.Webmaster


Unfortunately, Webmaster lets you export only 500 search phrases, and it cannot provide breakdowns by country, device type and so on. Because of these limitations, in addition to the positions of the 500 phrases from Webmaster, we will also export landing-page data from Yandex.Metrica. For sites without a large set of search phrases, 500 phrases is enough. If your semantic core for Yandex is fairly broad, you will have to pull positions from other sources or write your own position parser.

import json
import requests
import pandas as pd

def yandex_reports():
    token = "..."
    # resolve the user id for this token
    url = "https://api.webmaster.yandex.net/v4/user/"
    headers = {"Authorization": "OAuth " + token}
    res = requests.get(url, headers=headers)
    a = json.loads(res.text)
    userId = a['user_id']
    host_id = "https:test.com:443"
    # top 500 queries by impressions
    res = requests.get(url + str(userId) + "/hosts/" + host_id + "/search-queries/popular/?order_by=TOTAL_SHOWS&query_indicator=TOTAL_SHOWS", headers=headers)
    df1 = pd.DataFrame(columns=['query_id', 'query_text', 'shows', 'dt_from', 'dt_to'])
    a = json.loads(res.text)
    for i in a['queries']:
        temp = {}
        temp['query_id'] = i['query_id']
        temp['query_text'] = i['query_text']
        temp['shows'] = i['indicators']['TOTAL_SHOWS']
        temp['dt_from'] = a['date_from']
        temp['dt_to'] = a['date_to']
        df1 = df1.append(temp, ignore_index=True)
    # clicks for the same top-500 queries
    res = requests.get(url + str(userId) + "/hosts/" + host_id + "/search-queries/popular/?order_by=TOTAL_SHOWS&query_indicator=TOTAL_CLICKS", headers=headers)
    df2 = pd.DataFrame(columns=['query_id', 'clicks'])
    a = json.loads(res.text)
    for i in a['queries']:
        temp = {}
        temp['query_id'] = i['query_id']
        temp['clicks'] = i['indicators']['TOTAL_CLICKS']
        df2 = df2.append(temp, ignore_index=True)
    # average position for the same top-500 queries
    res = requests.get(url + str(userId) + "/hosts/" + host_id + "/search-queries/popular/?order_by=TOTAL_SHOWS&query_indicator=AVG_SHOW_POSITION", headers=headers)
    df3 = pd.DataFrame(columns=['query_id', 'position'])
    a = json.loads(res.text)
    for i in a['queries']:
        temp = {}
        temp['query_id'] = i['query_id']
        temp['position'] = i['indicators']['AVG_SHOW_POSITION']
        df3 = df3.append(temp, ignore_index=True)
    df1 = df1.merge(df2, on='query_id')
    df1 = df1.merge(df3, on='query_id')
    to_sql_server(df1, 'yandex_positions')
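The host_id has to be in the format the Webmaster API expects ("https:domain:443"). If you are unsure which value to use, the hosts available to the authorized user can be listed through the same API; a short sketch reusing url, userId and headers from the function above (hedged: assuming the /hosts listing endpoint of the same API v4):

# list the hosts available to this user to find the right host_id
res = requests.get(url + str(userId) + "/hosts", headers=headers)
for h in json.loads(res.text)['hosts']:
    print(h['host_id'])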

PageSpeed Insights


PageSpeed Insights lets you evaluate how quickly the site's content loads. If pages start loading slowly, this can noticeably lower the site's position in the search results.

import json
import pymssql
import requests
import pandas as pd
from datetime import datetime

# take the top 50 pages by clicks from the Google positions table
# (host, user, password are the MS SQL connection settings)
conn = pymssql.connect(host=host, user=user, password=password)
sql_string = r'''
;with a as(
    select distinct page, sum(clicks) as clicks
    from seo_google_positions
    group by page
)
select top 50 page from a order by clicks desc
'''
data = pd.read_sql(sql_string, conn)
conn.close()

# query PageSpeed Insights for each page and collect the loading metrics
now = datetime.now()
dat = pd.DataFrame(columns=['first_cpu_idle', 'first_contentful_paint', 'page', 'dt'])
for i, j in data.iterrows():
    url = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={0}&category=performance&strategy=desktop".format(j[0])
    res = requests.get(url)
    res = json.loads(res.text)
    temp = {}
    temp['first_cpu_idle'] = res['lighthouseResult']['audits']['first-cpu-idle']['displayValue']
    temp['first_contentful_paint'] = res['lighthouseResult']['audits']['first-contentful-paint']['displayValue']
    temp['page'] = j[0]
    temp['dt'] = now.strftime("%Y-%m-%d")
    dat = dat.append(temp, ignore_index=True)
to_sql_server(dat, 'google_pagespeed')
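The request above measures the desktop version of each page; the same endpoint also accepts strategy=mobile. A small sketch that wraps the call into a reusable function so both strategies can be collected, using the same audit fields as above:

import json
import requests

def pagespeed_metrics(page, strategy='desktop'):
    # strategy is either 'desktop' or 'mobile'
    url = ("https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
           "?url={0}&category=performance&strategy={1}").format(page, strategy)
    audits = json.loads(requests.get(url).text)['lighthouseResult']['audits']
    return {
        'page': page,
        'strategy': strategy,
        'first_cpu_idle': audits['first-cpu-idle']['displayValue'],
        'first_contentful_paint': audits['first-contentful-paint']['displayValue'],
    }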

Google Ads and Yandex.Direct


To estimate how often search queries are made, we export frequencies for the SEO semantic core from two tools:


Yandex Budget Forecast


Google Keyword Planner
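Both tools are used through their web interfaces, so this step is manual: the frequencies are downloaded as files and then loaded into the same database. A minimal sketch, assuming hypothetical file names and a simple query/frequency column layout (adjust it to whatever the Budget Forecast and Keyword Planner exports actually contain):

import pandas as pd

def frequency_reports():
    # hypothetical file names and columns: one search phrase per row plus its frequency
    yandex = pd.read_csv('yandex_frequency.csv', sep=';')
    yandex.columns = ['query', 'shows']
    to_sql_server(yandex, 'yandex_frequency')

    google = pd.read_csv('google_frequency.csv')
    google.columns = ['query', 'shows']
    to_sql_server(google, 'google_frequency')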

Yandex.Metrica


We export pageview and visit data for SEO traffic, broken down by landing page.

import json
import requests
import pandas as pd
from datetime import datetime, timedelta

token = "..."  # Metrica OAuth token
headers = {"Authorization": "OAuth " + token}
now = datetime.now()
fr = (now - timedelta(days=9)).strftime("%Y-%m-%d")
to = (now - timedelta(days=3)).strftime("%Y-%m-%d")
# organic visits and pageviews by landing page, search engine, country and device;
# replace ids=ids with the number of your Metrica counter
res = requests.get("https://api-metrika.yandex.net/stat/v1/data/?ids=ids&metrics=ym:s:pageviews,ym:s:visits&dimensions=ym:s:startURL,ym:s:lastsignSearchEngine,ym:s:regionCountry,ym:s:deviceCategory&date1={0}&date2={1}&group=all&filters=ym:s:lastsignTrafficSource=='organic'&limit=50000".format(fr, to), headers=headers)
a = json.loads(res.text)
re = pd.DataFrame(columns=['page', 'device', 'view', 'dt_from', 'dt_to', 'engine', 'visits', 'country', 'pageviews'])
for i in a['data']:
    temp = {}
    temp['page'] = i['dimensions'][0]['name']
    temp['engine'] = i['dimensions'][1]['name']
    temp['country'] = i['dimensions'][2]['name']
    temp['device'] = i['dimensions'][3]['name']
    temp['view'] = i['metrics'][0]
    temp['visits'] = i['metrics'][1]
    temp['pageviews'] = i['metrics'][0]
    temp['dt_from'] = fr
    temp['dt_to'] = to
    re = re.append(temp, ignore_index=True)
to_sql_server(re, 'yandex_pages')
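The request asks for up to 50,000 rows, which covers most sites. For larger ones the report can be paged; a hedged sketch, assuming the offset parameter and the total_rows response field of the Metrica Stat API:

# page through the report when more rows are available than the limit allows
base_url = ("https://api-metrika.yandex.net/stat/v1/data/?ids=ids"
            "&metrics=ym:s:pageviews,ym:s:visits"
            "&dimensions=ym:s:startURL,ym:s:lastsignSearchEngine,ym:s:regionCountry,ym:s:deviceCategory"
            "&date1={0}&date2={1}&group=all"
            "&filters=ym:s:lastsignTrafficSource=='organic'&limit=10000").format(fr, to)
offset = 1
rows = []
while True:
    a = json.loads(requests.get(base_url + "&offset={0}".format(offset), headers=headers).text)
    rows.extend(a['data'])
    if not a['data'] or offset - 1 + len(a['data']) >= a['total_rows']:
        break
    offset += len(a['data'])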

Collecting the data in Power BI


Let's recap what we have exported:

  • google_positions and yandex_positions
  • google_frequency and yandex_frequency
  • google_speed and yandex_speed
  • yandex_metrika

From this data we can build weekly dynamics by segment, summary figures by segment and by query, dynamics and summary figures by page, and content loading speed. This is what the final report looks like:



On the one hand there are a lot of different indicators, and it is hard to see the overall trend. On the other hand, each tile shows important data on positions, impressions, clicks, CTR and page loading speed.

Articles in this series:

Source: https://habr.com/ru/post/467019/

