This article collects typical usage examples of the Python function mrjob.compat.uses_yarn. If you are unsure what exactly uses_yarn does, how to call it, or what real-world usage looks like, the hand-picked code examples below should help.
Below are 12 code examples of uses_yarn, sorted by popularity by default. You can upvote the examples you like or find useful; your feedback helps the system recommend better Python code examples.
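Before diving into the examples, here is a minimal, self-contained sketch (not taken from any of the projects below) of the basic call pattern: uses_yarn takes a Hadoop version string and returns True when that version runs on YARN, which is how the examples decide between YARN-era and pre-YARN command-line flags. The version strings here are purely illustrative.

from mrjob.compat import uses_yarn

# pick command-line flags based on whether this Hadoop version runs on YARN
for version in ('1.0.3', '2.7.3'):
    if uses_yarn(version):
        print('%s: YARN-era flags (e.g. fs -rm -R)' % version)
    else:
        print('%s: pre-YARN flags (e.g. fs -rmr)' % version)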
Example 1: _hadoop_log_dirs
def _hadoop_log_dirs(self, output_dir=None):
    """Yield all possible places to look for hadoop logs."""
    # hadoop_log_dirs opt overrides all this
    if self._opts['hadoop_log_dirs']:
        for path in self._opts['hadoop_log_dirs']:
            yield path
        return

    hadoop_log_dir = os.environ.get('HADOOP_LOG_DIR')
    if hadoop_log_dir:
        yield hadoop_log_dir

    if uses_yarn(self.get_hadoop_version()):
        yarn_log_dir = os.environ.get('YARN_LOG_DIR')
        if yarn_log_dir:
            yield yarn_log_dir

    if output_dir:
        # Cloudera style of logging
        yield posixpath.join(output_dir, '_logs')

    for hadoop_dir in self._hadoop_dirs():
        yield posixpath.join(hadoop_dir, 'logs')

    # hard-coded log paths for EMR, so this can work out-of-the-box
    for path in _EMR_HADOOP_LOG_DIRS:
        yield path
Author: sebratt | Project: mrjob | Lines: 27 | Source: hadoop.py
Example 2: _hadoop_log_dirs
def _hadoop_log_dirs(self, output_dir=None):
    """Yield all possible places to look for hadoop logs."""
    # hadoop_log_dirs opt overrides all this
    if self._opts['hadoop_log_dirs']:
        for path in self._opts['hadoop_log_dirs']:
            yield path
        return

    hadoop_log_dir = os.environ.get('HADOOP_LOG_DIR')
    if hadoop_log_dir:
        yield hadoop_log_dir

    yarn = uses_yarn(self.get_hadoop_version())

    if yarn:
        yarn_log_dir = os.environ.get('YARN_LOG_DIR')
        if yarn_log_dir:
            yield yarn_log_dir

        yield _DEFAULT_YARN_HDFS_LOG_DIR

    if output_dir:
        # Cloudera style of logging
        yield posixpath.join(output_dir, '_logs')

    for hadoop_dir in self._hadoop_dirs():
        yield posixpath.join(hadoop_dir, 'logs')

    # hard-coded fallback paths
    if yarn:
        for path in _FALLBACK_HADOOP_YARN_LOG_DIRS:
            yield path

    for path in _FALLBACK_HADOOP_LOG_DIRS:
        yield path
Author: okomestudio | Project: mrjob | Lines: 35 | Source: hadoop.py
Example 3: _interpret_task_logs
def _interpret_task_logs(self, log_interpretation, partial=True):
    """Fetch task syslogs and stderr, and add 'task' to interpretation."""
    if 'task' in log_interpretation and (
            partial or not log_interpretation['task'].get('partial')):
        return  # already interpreted

    step_interpretation = log_interpretation.get('step') or {}

    application_id = step_interpretation.get('application_id')
    job_id = step_interpretation.get('job_id')
    output_dir = step_interpretation.get('output_dir')

    yarn = uses_yarn(self.get_hadoop_version())

    if yarn:
        if not application_id:
            log.warning("Can't fetch task logs; missing application ID")
            return
    else:
        if not job_id:
            log.warning("Can't fetch task logs; missing job ID")
            return

    log_interpretation['task'] = _interpret_task_logs(
        self.fs,
        self._ls_task_syslogs(
            application_id=application_id,
            job_id=job_id,
            output_dir=output_dir),
        partial=partial,
        stderr_callback=_log_parsing_task_stderr)
Author: Dean838 | Project: mrjob | Lines: 31 | Source: mixin.py
Example 4: _interpret_task_logs
def _interpret_task_logs(self, log_interpretation):
    """Find and interpret the task logs, storing the
    interpretation in ``log_interpretation['task']``."""
    if 'task' not in log_interpretation:
        # get job/application ID from output of hadoop command
        step_interpretation = log_interpretation.get('step') or {}

        application_id = step_interpretation.get('application_id')
        job_id = step_interpretation.get('job_id')

        yarn = uses_yarn(self.get_hadoop_version())

        if yarn and application_id is None:
            log.warning("Can't fetch task logs without application ID")
            return {}
        elif not yarn and job_id is None:
            log.warning("Can't fetch task logs without job ID")
            return {}

        # Note: this is unlikely to be super-helpful on "real" (multi-node)
        # pre-YARN Hadoop because task logs aren't generally shipped to a
        # local directory. It's a start, anyways. See #1201.
        def stream_task_log_dirs():
            for log_dir in unique(
                    self._hadoop_log_dirs(
                        output_dir=step_interpretation.get('output_dir'))):
                if yarn:
                    path = self.fs.join(
                        log_dir, 'userlogs', application_id)
                else:
                    # sometimes pre-YARN attempt logs are organized by
                    # job_id, sometimes not. Play it safe
                    path = self.fs.join(log_dir, 'userlogs')

                if self.fs.exists(path):
                    log.info('Scanning task syslogs in %s' % path)
                    yield [path]

        # wrap _ls_task_syslogs() to add logging
        def ls_task_syslogs():
            # there should be at most one history log
            for match in _ls_task_syslogs(
                    self.fs, stream_task_log_dirs(),
                    application_id=application_id, job_id=job_id):
                # TODO: this isn't really correct because
                # _interpret_task_logs() sorts the log paths and
                # scans starting at the most recent one. Probably
                # should have _ls_task_syslogs() do the sorting.
                log.info(' Scanning for errors: %s' % match['path'])
                yield match

        log_interpretation['task'] = _interpret_task_logs(
            self.fs, ls_task_syslogs())

    return log_interpretation['task']
Author: imtiaz39 | Project: mrjob | Lines: 57 | Source: hadoop.py
Example 5: ls
def ls(self, path_glob):
    components = urlparse(path_glob)
    hdfs_prefix = '%s://%s' % (components.scheme, components.netloc)

    version = self.get_hadoop_version()

    # use ls -R on Hadoop 2 (see #1152)
    if uses_yarn(version):
        args = ['fs', '-ls', '-R', path_glob]
    else:
        args = ['fs', '-lsr', path_glob]

    try:
        stdout = self.invoke_hadoop(args, return_stdout=True,
                                    ok_stderr=[_HADOOP_LS_NO_SUCH_FILE])
    except CalledProcessError:
        raise IOError("Could not ls %s" % path_glob)

    for line in BytesIO(stdout):
        line = line.rstrip(b'\r\n')

        # ignore total item count
        if line.startswith(b'Found '):
            continue

        fields = line.split(b' ')

        # Throw out directories
        if fields[0].startswith(b'd'):
            continue

        # Try to figure out which part of the line is the path
        # Expected lines:
        #
        # HDFS:
        # -rw-r--r--   3 dave users       3276 2010-01-13 14:00 /foo/bar
        #
        # S3:
        # -rwxrwxrwx   1       3276 2010-01-13 14:00 /foo/bar
        path_index = None
        for index, field in enumerate(fields):
            # look for time field, and pick one after that
            # (can't use field[2] because that's an int in Python 3)
            if len(field) == 5 and field[2:3] == b':':
                path_index = (index + 1)

        if not path_index:
            raise IOError("Could not locate path in string %r" % line)

        path = to_unicode(line.split(b' ', path_index)[-1])

        # handle fully qualified URIs from newer versions of Hadoop ls
        # (see Pull Request #577)
        if is_uri(path):
            yield path
        else:
            yield hdfs_prefix + path
Author: Yelp | Project: mrjob | Lines: 55 | Source: hadoop.py
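To make the path-detection heuristic in Example 5 more concrete, here is a tiny standalone sketch (not part of mrjob) that applies the same rule, find the HH:MM time field and take everything after it, to a sample hadoop fs -ls output line. The sample line is invented for illustration.

sample = b'-rw-r--r--   3 dave users       3276 2010-01-13 14:00 /foo/bar'

fields = sample.split(b' ')

path_index = None
for index, field in enumerate(fields):
    # the time field looks like b'14:00': five bytes with b':' in the middle
    if len(field) == 5 and field[2:3] == b':':
        path_index = index + 1

# split at most path_index times so a path containing spaces stays intact
print(sample.split(b' ', path_index)[-1])  # b'/foo/bar'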
Example 6: mkdir
def mkdir(self, path):
    version = self.get_hadoop_version()

    # use -p on Hadoop 2 (see #991, #845)
    if uses_yarn(version):
        args = ['fs', '-mkdir', '-p', path]
    else:
        args = ['fs', '-mkdir', path]

    try:
        self.invoke_hadoop(args, ok_stderr=[_HADOOP_FILE_EXISTS_RE])
    except CalledProcessError:
        raise IOError("Could not mkdir %s" % path)
Author: Yelp | Project: mrjob | Lines: 13 | Source: hadoop.py
Example 7: _interpret_task_logs
def _interpret_task_logs(
        self, log_interpretation, step_type, error_attempt_ids=(),
        partial=True):
    """Fetch task syslogs and stderr, and add 'task' to interpretation."""
    if 'task' in log_interpretation and (
            partial or not log_interpretation['task'].get('partial')):
        return  # already interpreted

    if not self._read_logs():
        return

    step_interpretation = log_interpretation.get('step') or {}

    application_id = step_interpretation.get('application_id')
    job_id = step_interpretation.get('job_id')
    output_dir = step_interpretation.get('output_dir')

    yarn = uses_yarn(self.get_hadoop_version())

    attempt_to_container_id = log_interpretation.get('history', {}).get(
        'attempt_to_container_id', {})

    if yarn:
        if not application_id:
            if not log_interpretation.get('no_job'):
                log.warning(
                    "Can't fetch task logs; missing application ID")
            return
    else:
        if not job_id:
            if not log_interpretation.get('no_job'):
                log.warning("Can't fetch task logs; missing job ID")
            return

    if _is_spark_step_type(step_type):
        interpret_func = _interpret_spark_task_logs
    else:
        interpret_func = _interpret_task_logs

    log_interpretation['task'] = interpret_func(
        self.fs,
        self._ls_task_logs(
            step_type,
            application_id=application_id,
            job_id=job_id,
            output_dir=output_dir,
            error_attempt_ids=error_attempt_ids,
            attempt_to_container_id=attempt_to_container_id,
        ),
        partial=partial,
        log_callback=_log_parsing_task_log)
Author: Affirm | Project: mrjob | Lines: 51 | Source: mixin.py
Example 8: rm
def rm(self, path_glob):
    if not is_uri(path_glob):
        super(HadoopFilesystem, self).rm(path_glob)

    version = self.get_hadoop_version()
    if uses_yarn(version):
        args = ['fs', '-rm', '-R', '-f', '-skipTrash', path_glob]
    else:
        args = ['fs', '-rmr', '-skipTrash', path_glob]

    try:
        self.invoke_hadoop(
            args,
            return_stdout=True, ok_stderr=[_HADOOP_RM_NO_SUCH_FILE])
    except CalledProcessError:
        raise IOError("Could not rm %s" % path_glob)
Author: Yelp | Project: mrjob | Lines: 16 | Source: hadoop.py
Example 9: _find_probable_cause_of_failure
def _find_probable_cause_of_failure(self, application_id=None, job_id=None,
                                    output_dir=None, **ignored):
    """Find probable cause of failure. Currently we just scan task logs.

    On YARN, you must set application_id, and pre-YARN, you must set
    job_id.
    """
    # package up logs for _find_error_in_task_logs(),
    # and log where we're looking.
    hadoop_version = self.get_hadoop_version()

    yarn = uses_yarn(hadoop_version)

    if yarn and application_id is None:
        log.warning("No application ID!")
        return None

    if not yarn and job_id is None:
        log.warning("No job ID!")
        return None

    # Note: this is unlikely to be super-helpful on "real" (multi-node)
    # pre-YARN Hadoop because task logs aren't generally shipped to a local
    # directory. It's a start, anyways. See #1201.
    def stream_task_log_dirs():
        for log_dir in unique(
                self._hadoop_log_dirs(output_dir=output_dir)):
            if yarn:
                path = self.fs.join(log_dir, 'userlogs', application_id)
            else:
                # sometimes pre-YARN attempt logs are organized by job_id,
                # sometimes not. Play it safe
                path = self.fs.join(log_dir, 'userlogs')

            if self.fs.exists(path):
                log.info('looking for logs in %s' % path)
                yield [path]

    return _find_error_in_task_logs(
        self.fs, stream_task_log_dirs(), hadoop_version,
        application_id=application_id, job_id=job_id)
Author: BeeswaxIO | Project: mrjob | Lines: 41 | Source: hadoop.py
Example 10: mock_hadoop_uses_yarn
def mock_hadoop_uses_yarn(environ):
    return uses_yarn(environ['MOCK_HADOOP_VERSION'])
Author: kodizant | Project: mrjob | Lines: 2 | Source: mockhadoop.py
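A hypothetical call to the helper in Example 10 might look like the sketch below; the dict stands in for os.environ inside a test, and MOCK_HADOOP_VERSION is the environment variable the function reads.

# a stand-in for os.environ inside a test
fake_environ = {'MOCK_HADOOP_VERSION': '2.7.1'}

print(mock_hadoop_uses_yarn(fake_environ))  # True, since 2.x Hadoop runs on YARN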
Example 11: test_uses_yarn
def test_uses_yarn(self):
    self.assertEqual(uses_yarn('0.22'), False)
    self.assertEqual(uses_yarn('0.23'), True)
    self.assertEqual(uses_yarn('0.24'), True)
    self.assertEqual(uses_yarn('1.0.0'), False)
    self.assertEqual(uses_yarn('2.0.0'), True)
Author: kartheek6 | Project: mrjob | Lines: 6 | Source: test_compat.py
Example 12: _find_error_in_task_logs
def _find_error_in_task_logs(fs, log_dirs_stream, hadoop_version,
                             application_id=None, job_id=None):
    """Given a filesystem and a stream of lists of log dirs to search in,
    find the last error and return details about it. *hadoop_version*
    is required, as task logs have very different paths in YARN.

    In YARN, you must set *application_id*, and pre-YARN, you must set
    *job_id*, or we'll bail out and return None.

    Returns a dictionary with the following keys ("optional" means
    that something may be None):

    syslog: dict with keys:
        path: path of syslog we found error in
        error: error details; dict with keys:
            exception: Java exception (as string)
            stack_trace: array of lines with Java stack trace
        split: optional input split we were reading; dict with keys:
            path: path of input file
            start_line: first line of split (0-indexed)
            num_lines: number of lines in split
    stderr: optional dict with keys:
        path: path of stderr corresponding to syslog
        error: optional error details; dict with keys:
            exception: string (Python exception)
            traceback: array of lines with Python stack trace
    type: always set to 'task'
    """
    syslog_paths = []

    yarn = uses_yarn(hadoop_version)

    if ((yarn and application_id is None) or
            (not yarn and job_id is None)):
        return None

    # we assume that each set of log paths contains the same copies
    # of syslogs, so stop once we find any non-empty set of log dirs
    for log_dirs in log_dirs_stream:
        if yarn:
            syslog_paths = _ls_yarn_task_syslogs(fs, log_dirs,
                                                 application_id=application_id)
        else:
            syslog_paths = _ls_pre_yarn_task_syslogs(fs, log_dirs,
                                                     job_id=job_id)

        if syslog_paths:
            break

    for syslog_path in syslog_paths:
        log.debug('Looking for error in %s' % syslog_path)

        syslog_info = _parse_task_syslog(_cat_log(fs, syslog_path))
        if not syslog_info['error']:
            continue

        # found error! see if we can explain it

        # TODO: don't bother if error wasn't due to child process
        stderr_path = _stderr_for_syslog(syslog_path)

        stderr_info = _parse_python_task_stderr(_cat_log(fs, stderr_path))

        # output error info
        syslog_info['path'] = syslog_path
        stderr_info['path'] = stderr_path

        return dict(type='task', syslog=syslog_info, stderr=stderr_info)

    return None
Author: parastoo-62 | Project: mrjob | Lines: 69 | Source: interpret.py
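Based purely on the docstring in Example 12, a successful return value has roughly the shape sketched below. Every path, exception message, and number here is invented to illustrate the structure, not taken from real output.

probable_cause = {
    'type': 'task',
    'syslog': {
        'path': '/var/log/hadoop/userlogs/application_0001/container_01/syslog',
        'error': {
            'exception': 'java.lang.RuntimeException: PipeMapRed failed!',
            'stack_trace': ['    at org.apache.hadoop.streaming.PipeMapRed...'],
        },
        'split': {
            'path': 'hdfs:///user/dave/input/part-00000',
            'start_line': 0,
            'num_lines': 335,
        },
    },
    'stderr': {
        'path': '/var/log/hadoop/userlogs/application_0001/container_01/stderr',
        'error': {
            'exception': "KeyError: 'foo'",
            'traceback': ['Traceback (most recent call last):', '...'],
        },
    },
}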
Note: The mrjob.compat.uses_yarn examples in this article were compiled by 纯净天空 from source code and documentation hosted on platforms such as GitHub and MSDocs. The snippets were selected from open-source projects contributed by many developers; copyright of the source code remains with the original authors, and redistribution or use should follow each project's License. Do not repost without permission.