Tweeting differences in Rakuten Mobile's blanket license (Nara Prefecture)

Python Rakuten Mobile pandas

Installation: pip install pandas pip install requests pip install tweepy Program: # -*- coding: utf-8 -*- import csv import datetime import pathlib import urllib.parse import pandas as pd import requests import tweepy # Twitter consumer_k…

2021-06-06

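Data wrangling base station information with the Radio Station Information Search (Web-API function)

Python pandas Rakuten Mobile

www.tele.soumu.go.jp List of request conditions for the count API / list API; list of code values; local government codes. import urllib.parse import requests import pandas as pd d = { # 1: license information search 2: registration information search "ST": 1, # Attach detailed information 0: no 1: yes "DA": 1, …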
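
As a minimal sketch of how a parameter dict like the one in the excerpt becomes a query string: only the two keys visible above (`ST` and `DA`) are used, because the rest of the dict is truncated. The variable names `params` and `query` are illustrative, and no endpoint URL is assumed here.

```python
import urllib.parse

# Parameter names and values are the two visible in the excerpt;
# the remaining parameters are truncated there and are not guessed.
params = {
    "ST": 1,  # 1: license information search, 2: registration information search
    "DA": 1,  # attach detailed information (0: no, 1: yes)
}

# urlencode() joins the key/value pairs into a URL query string.
query = urllib.parse.urlencode(params)
print(query)
```

The resulting string would be appended after a `?` to whatever API URL the post uses.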

2020-10-24

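Mapping nursery school vacancy status in Imabari City

Python scraping

Scrape the table in the PDF listing admission availability at nursery schools and similar facilities, published by Imabari City's Nursery and Kindergarten Division, then display it on a map. The Imabari City open data list has addresses and location information for nursery schools and certified children's centers, but not for all of them, so I created a spreadsheet. Spreadsheet: nursery schools in Imabari City (…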

2020-01-17

Online Python formatters

Environment setup Python

https://black.now.sh/ https://yapf.now.sh/

2019-08-06

Saving one file per group with pandas

Python

import pandas as pd df = pd.read_csv("2018.tsv", sep='\t') grouped_df = df.groupby("id") # print(grouped_df.groups) for id in grouped_df.groups: d = grouped_df.get_group(id) # Build the file name filename = str(id) + ".tsv" # Save the file d.to_…
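
A self-contained sketch of the same idea, under a few assumptions: the small DataFrame stands in for `2018.tsv`, the output goes to a temporary directory, and `index=False` is my choice rather than something the excerpt shows. It iterates over the GroupBy object directly, which yields each key together with its sub-DataFrame.

```python
import pathlib
import tempfile

import pandas as pd

# Hypothetical sample data standing in for "2018.tsv".
df = pd.DataFrame({"id": [1, 1, 2], "value": ["a", "b", "c"]})

# Write into a throwaway directory so the sketch leaves no files behind
# in the working directory.
out_dir = pathlib.Path(tempfile.mkdtemp())

# Iterating over a GroupBy yields (key, sub-DataFrame) pairs,
# so a separate get_group() call is not needed.
for key, group in df.groupby("id"):
    group.to_csv(out_dir / f"{key}.tsv", sep="\t", index=False)

saved = sorted(p.name for p in out_dir.iterdir())
print(saved)
```

Using a loop variable named `key` also avoids shadowing the built-in `id`.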

2019-07-06

Scraping dam and river information from i.river.go.jp

Python scraping

Unavailable as of 2021/04/18. import re import requests from bs4 import BeautifulSoup headers = { "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko" } def scraping(url, pattern): r = requests.get(url, head…

2019-05-12

JFL goal ranking (Google Drive)

JFL Python pandas

imabari.hateblo.jp import csv import time from urllib.parse import urljoin import gspread import pandas as pd import requests from bs4 import BeautifulSoup from oauth2client.service_account import ServiceAccountCredentials from tqdm import…

2019-05-12

Building JFL rankings (Google Drive)

JFL Python pandas

pip3 install pandas pip3 install beautifulsoup4 pip3 install html5lib pip3 install lxml pip3 install gspread pip3 install oauth2client pip3 install tqdm # -*- coding: utf-8 -*- import gspread import pandas as pd import requests from bs4 im…

2019-01-18

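CSS selector basics for scraping

scraping Python

Sample <div class="class" id="id"> <h1>タイトル</h1> <h2>サブタイトル</h2> <p value="abc">テスト</p> <p value="abc def">テスト</p> <p value="abc-def">テスト</p> <ul> <li>1</li> <li>2</li> <li>3</li> </ul> </div> Basics (syntax / description / example): * / all elements / *; element name / elements with that name / div; .classname / elements with that class attribute / div.class; #idname / elements with that id attribute / div#id. Relationships between selectors (syntax / description / sam…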

2018-12-11

Summary of scraping with Python

scraping Python

If you are starting out with scraping now, requests-html is recommended. Requests-HTML: HTML Parsing for Humans (writing Python 3)! — requests-HTML v0.3.4 documentation imabari.hateblo.jp imabari.hateblo.jp imabari.hateblo.jp imabari.hateblo.jp imabari…

2018-12-05

Filling data into rowspan/colspan cells with BeautifulSoup

scraping Python

import csv import requests from bs4 import BeautifulSoup url = 'https://www.pref.ehime.jp/h25115/kanjyo/topics/influ1819/tb_flu1819.html' # Cell copy True: blank, False: copy flag = False headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT …
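
The excerpt is cut off before the expansion logic, so the following is only a sketch of one way to do it, not the post's code. It assumes the cells have already been pulled out of the HTML as `(text, rowspan, colspan)` tuples; the function name `expand_spans`, the `blank` parameter (modelled on the True: blank / False: copy flag above), and the sample table are all hypothetical.

```python
def expand_spans(rows, blank=False):
    """Expand rowspan/colspan cells into a rectangular grid.

    rows:  list of rows, each a list of (text, rowspan, colspan) tuples.
    blank: if True, spanned positions get "" instead of a copy of the text.
    """
    grid = {}  # (row, col) -> text
    for r, cells in enumerate(rows):
        c = 0
        for text, rowspan, colspan in cells:
            # Skip positions already occupied by a rowspan from an earlier row.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    first = dr == 0 and dc == 0
                    grid[(r + dr, c + dc)] = text if first or not blank else ""
            c += colspan
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    # Positions never covered by any cell are filled with "".
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


# Hypothetical table: "week" spans two rows, "north" spans two columns.
table = [
    [("week", 2, 1), ("north", 1, 2)],
    [("a", 1, 1), ("b", 1, 1)],
    [("1", 1, 1), ("10", 1, 1), ("20", 1, 1)],
]
copied = expand_spans(table)
blanked = expand_spans(table, blank=True)
```

With BeautifulSoup, the tuples could be built from each cell's text and its `rowspan`/`colspan` attributes (defaulting to 1) before calling the function.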

2018-11-30

Main text extraction

Python scraping

kanji.hatenablog.jp github.com import time import requests from bs4 import BeautifulSoup from extractcontent3 import ExtractContent headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko' } def scr…

2018-09-22

Scraping with requests-html

Python scraping

requests-html https://html.python-requests.org/ !pip install requests-html !pip install retry Scraping the Nikkei Stock Average: from requests.exceptions import ConnectionError, TooManyRedirects, HTTPError from requests_html import HTMLSession from …

2018-08-23

json feed

Python

import requests import json url = 'http://www.city.seiyo.ehime.jp/kinkyu/index.update.json' headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko' } r = requests.get(url, headers=headers) data = r…

2018-08-18

Scraping with BeautifulSoup

Python scraping

BeautifulSoup is easier after all, since everything fits in a single file. import datetime import os import re import string import time from urllib.parse import urljoin import requests from bs4 import BeautifulSoup headers = { 'User-Agent': 'Mozilla/5.0 (Wi…

2018-08-18

Scraping with Scrapy

Python scraping

I normally use only BeautifulSoup, so I tried building this with Scrapy. Based on github.com. note.nkmk.me https://doc.scrapy.org/en/latest/topics/commands.html # Install pip install scrapy # Create the project scrapy startproject cheerup…

2018-08-13

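Tamagawa Dam, Soja River (water level), telemeter (rainfall)

Python scraping

Unavailable as of 2021/04/18. Extracted from i.river.go.jp with regular expressions. Twitter's character limit has increased, so this should just barely fit. import re import time import requests import twitter from bs4 import BeautifulSoup headers = { 'User-Agent': 'Mozilla/5.0 (Windows N…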

2018-08-07

Tallying volunteer numbers

Python scraping pandas

!pip install lxml !pip install seaborn !apt install fonts-ipafont-gothic !rm /content/.cache/matplotlib/fontList.json import pandas as pd url = 'https://ehimesvc.jp/?p=70' dfs = pd.read_html(url, index_col=0, na_values=['活動中止', '終了',…

2018-07-27

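Scraping Kanogawa Dam, Nomura Dam, and the Hiji River

Python scraping

Unavailable as of 2021/04/18. Installation: pip install requests pip install python-twitter pip install apscheduler pip install beautifulsoup4 Running the program prints output at minutes 8, 18, 28, 38, 48, and 58 of each hour. Fill in the commented-out Twitter keys to…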

2018-07-25

Water levels of Tamagawa Dam and the Soja River

Python scraping

Unavailable as of 2021/04/18. import datetime import requests import twitter from bs4 import BeautifulSoup # Convert a string to a float; 0.0 if it cannot be converted def moji_float(x): try: result = float(x) except: result = 0.0 return result # When the string is empty…
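
A slightly tightened variant of `moji_float`, offered as a sketch: the only change from the excerpt is catching `TypeError` and `ValueError` instead of using a bare `except`, which is my suggestion rather than the post's code.

```python
def moji_float(x):
    """Convert x to float; return 0.0 when conversion fails."""
    try:
        return float(x)
    except (TypeError, ValueError):
        # Narrower than a bare except, so KeyboardInterrupt and other
        # unrelated errors still propagate.
        return 0.0


print(moji_float("1.25"), moji_float(""), moji_float(None))
```

Scraped cells that hold an empty string or a non-numeric marker then fall back to 0.0 without hiding other failures.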

2018-07-16

Scraping dam information from Ehime Prefecture River and Flood Prevention Information

scraping Python

Changed to display progress. import csv import datetime import time import requests from bs4 import BeautifulSoup from tqdm import tqdm def date_span(start_date, end_date, hour_interval): res = [] n = start_date while n < end_date: n += da…
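
The body of `date_span` is truncated in the excerpt, so this is one plausible reading and not the original: it assumes the result should include `start_date`, exclude `end_date`, and step by `hour_interval` hours. The guard against a non-positive interval is my addition.

```python
import datetime


def date_span(start_date, end_date, hour_interval):
    """Return datetimes from start_date up to (not including) end_date."""
    if hour_interval <= 0:
        # Without this check the loop below would never terminate.
        raise ValueError("hour_interval must be positive")
    step = datetime.timedelta(hours=hour_interval)
    res = []
    n = start_date
    while n < end_date:
        res.append(n)
        n += step
    return res


start = datetime.datetime(2018, 7, 1, 0, 0)
end = datetime.datetime(2018, 7, 2, 0, 0)
spans = date_span(start, end, 6)
print(len(spans), spans[0], spans[-1])
```

Each returned timestamp can then drive one request, with `tqdm` wrapped around the list to show progress.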

2018-07-01

Speeding up scraping with asyncio

Python scraping

import asyncio import aiohttp import requests from bs4 import BeautifulSoup async def scraping(url): async with aiohttp.ClientSession() as session: async with session.get(url) as response: html = await response.text() soup = BeautifulSoup(…
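
To show the concurrency pattern without touching the network, here is a sketch in which `fetch` is a stand-in that sleeps instead of calling `aiohttp`; the function names and example URLs are hypothetical.

```python
import asyncio


async def fetch(url):
    # Stand-in for a real request (e.g. aiohttp's session.get);
    # it sleeps instead of doing network I/O.
    await asyncio.sleep(0.1)
    return f"<html>{url}</html>"


async def main(urls):
    # gather() runs the coroutines concurrently and returns their
    # results in the same order as the inputs.
    return await asyncio.gather(*(fetch(u) for u in urls))


urls = [f"http://www.example.com/{i}" for i in range(5)]
pages = asyncio.run(main(urls))
print(len(pages))
```

Because the five waits overlap, the whole run takes roughly as long as one request rather than five.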

2018-07-01

Speeding up scraping with ThreadPoolExecutor

Python scraping

from concurrent.futures import ThreadPoolExecutor import requests from bs4 import BeautifulSoup def scraping(url): r = requests.get(url) if r.status_code == requests.codes.ok: soup = BeautifulSoup(r.content, 'html5lib') result = [] return …
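
The excerpt stops before the executor is used, so the sketch below shows one common way to drive it with `executor.map`. The `scraping` function here is a stand-in that sleeps and returns the URL length instead of fetching a page; `max_workers=4` and the URLs are arbitrary choices for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def scraping(url):
    # Stand-in for requests.get() plus parsing; sleeps to mimic I/O wait.
    time.sleep(0.1)
    return len(url)


urls = [f"http://www.example.com/{i}" for i in range(8)]

# map() submits every URL to the pool and yields results in input order.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(scraping, urls))

print(results)
```

Threads suit this workload because the time is spent waiting on responses, not on CPU work.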

2018-06-22

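Scraping with POST requests using Python's requests

Python scraping emergency hospitals

Scrape the emergency hospitals in the Imabari area and tally them by day of the week and by medical institution. Send the POST with requests instead of using selenium. Check the POST contents in the Firefox developer tools: in the Network tab, select the request whose method is POST and check the form data under Params. "blockCd[3]": "", "forward_n…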

2018-06-12

Analysis of JFL match results

Python pandas scraping

github.com

2018-06-10

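Displaying Imabari's population by school district on a map

Map Python

Imabari City open data list | Information Policy Division open data | Imabari City. Imabari City's Basic Resident Register population statistics for fiscal year Heisei 30 (2018) http://www.city.imabari.ehime.jp/opendata/toukei_jinkou/201805.xls Elementary and junior high schools http://www.city.imabari.ehime.jp/opendata/data/school.csv …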

2018-05-22

FC Imabari goal tally and first goals

Program Python pandas JFL

!pip install lxml !apt install fonts-ipafont-gothic !rm /content/.cache/matplotlib/fontList.json """Restart""" import time import csv import requests from bs4 import BeautifulSoup # Number of matches n = 10 + 1 with open('fcimabari_goal.tsv', 'w') as …

2018-05-17

Scraping a table with Python and saving it as CSV

scraping pandas Python Program

With BeautifulSoup: import csv from bs4 import BeautifulSoup import requests url = 'http://www.example.com/' r = requests.get(url) if r.status_code == requests.codes.ok: soup = BeautifulSoup(r.content, 'html.parser') result = [[[td.get_tex…

2018-04-23

Playing with Mini Loto in pandas

Python Program pandas

!pip install lxml import pandas as pd import io import requests # The download fails unless a User-Agent is set url = 'http://www.japannetbank.co.jp/lottery/co/minilotojnb.csv' headers = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64;…

2018-04-20

Fetching Mini Loto results

pandas Python

Save as TSV: from selenium import webdriver from selenium.webdriver.firefox.options import Options from bs4 import BeautifulSoup import csv options = Options() options.set_headless() driver = webdriver.Firefox(options=options) driver.get( 'http…