Data Crawler Staff - Công ty Cổ phần Webify Group
Ho Chi Minh | Experienced (Non-Manager) | Permanent | Not required | Negotiable | IT - Software, IT - Hardware / Network | 1 | 12/03/2025
Job Description
Company name: Công ty Cổ phần Webify Group
Address: 9th Floor, 19 Hồ Văn Huê, Ward 9, Phú Nhuận District, Ho Chi Minh City
Phone: 0389614211
Website: https://infodoanhnghiep.com/thong-tin/Cong-Ty-Co-Phan-Webify-Group-05473.html
Job duties:
1. Professional Scraping System Development
Technical Requirements:
System Architecture:
- Build scalable systems
- Develop parallel crawling solutions
- Manage large, multi-threaded data streams
Technologies:
- Scrapy, BeautifulSoup
- Selenium
- Asyncio, Multiprocessing
- Proxy management
- IP rotation techniques
2. Data Processing and Normalization
Processing Methods:
- Develop cleaning processes for data collected from APIs
- Data transformation algorithms
- Integrity checks
- Remove noisy data
Tools:
- Pandas
- Data validation techniques
- Machine Learning preprocessing
3. Database Management
Specialized Skills:
Advanced SQL:
- Complex queries
- Performance optimization
4. Monitoring & Optimization
Strategy:
- Manage scraping system operations
- Track scraping performance
- Challenge handling:
  - IP blocking
  - Rate limiting
  - CAPTCHA
Salary: 12,000,000 to 18,000,000 VND/month
Work location: 9th Floor, 19 Hồ Văn Huê, Ward 9, Phú Nhuận District, Ho Chi Minh City
Working conditions:
- Working hours: office hours, 7 hours/day (morning 08:00 - 11:30, afternoon 13:00 - 16:30), Monday to Friday; Saturday and Sunday off.
- Working equipment: provided
Job Requirements
Candidate requirements:
PROFESSIONAL REQUIREMENTS
Education
- Bachelor's degree (GPA > 3.0)
- Major: Data science, computer engineering, or a data-related field
- English: TOEIC > 700 or IELTS > 5.5
Technical Skills
Python Ecosystem
- Asyncio, Multiprocessing
- Data cleaning techniques
- Machine Learning preprocessing
- Advanced error handling
Database & Big Data
- SQL (Intermediate to Advanced)
- NoSQL database management
- PySpark
- Data warehousing
In-depth Experience
- Minimum 1-2 years
- Project implementation:
  - Web scraping
  - Automatic data processing
  - Big data crawling
SOFT SKILLS
- System analysis
- Problem solving
- Independent & team working
- Time management
- Logical thinking
NICE TO HAVE
- Big Data experience
- Data pipeline design
- Working with diverse APIs
- Professional certifications
- Creativity and initiative in proposing ideas
Application documents:
- CV
How to apply:
- Application deadline: 12/03/2025
- Submission method: Submit your CV via the job portal of Ho Chi Minh City Open University (Trường Đại học Mở TP.HCM)
Contact information:
- Job portal of Ho Chi Minh City Open University (Trường Đại học Mở TP.HCM)